Hardware Intensity of Crypto Mining vs AI Generation
Executive summary
Crypto mining based on proof‑of‑work (PoW) and modern AI generation both depend on highly specialized compute infrastructure, but they concentrate hardware and energy in very different ways. Bitcoin mining alone consumed on the order of 120–175 TWh per year in 2023–2025, roughly 0.2–0.9% of global electricity use, with more recent Cambridge estimates around 138 TWh and 0.5% of global demand. Global data centres, by contrast, consumed about 415 TWh in 2024 (around 1.5% of world electricity), with the International Energy Agency (IEA) projecting a rise to about 945 TWh by 2030 and identifying AI as the main driver of that growth. On a per‑operation basis, PoW mining deliberately burns energy in an open‑ended hash race, while AI hardware delivers orders of magnitude more useful computation per joule but is deployed across rapidly scaling fleets of accelerators.[1][2][3][4][5][6][7][8]
The hardware intensity of PoW crypto is characterized by highly application‑specific ASICs measured in joules per terahash, continuously run at high utilization in geographically concentrated sites, and periodically rendered obsolete by difficulty adjustments and hardware advances. In AI, the intensity is driven by multi‑purpose accelerators such as Nvidia H100 GPUs and Google TPUs delivering up to thousands of TFLOPs at hundreds of watts per chip, with massive capital expenditures on data‑centre capacity and large, though highly variable, energy footprints per model training run and per inference. The transition of major blockchains such as Ethereum from PoW to proof‑of‑stake (PoS), which cut Ethereum’s energy use by roughly 99.95–99.98%, demonstrates that high hardware intensity is a design choice of particular consensus mechanisms rather than an intrinsic property of “blockchain” per se, whereas rising AI hardware intensity is more tightly coupled to demand for increasingly capable models and services.[9][10][11][12][13][14][15][16][17][18][19][20][21][22][23]
"Hardware intensity" can be understood along several complementary dimensions:
Power intensity: instantaneous power draw per device and aggregate power draw per facility or network (e.g., GW for Bitcoin, MW for AI data centres).[2][3][24]
Energy per unit of function: joules per terahash for mining, joules per token or per query for AI inference, and MWh per model training run (a minimal conversion sketch follows this list).[11][15][18][25][26][9]
Capital intensity: cost and volume of specialized hardware required to sustain a given level of activity (ASIC farms vs GPU/TPU clusters), often reflected in multi‑hundred‑billion‑dollar annual data‑centre capex for AI.[13][14][27][2]
Lifecycle and e‑waste: hardware churn rates and associated waste streams, such as kilotonnes of retired ASICs annually versus multi‑generation reuse of accelerators in AI clusters.[28][1]
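To make these definitions concrete, the following sketch converts published device and fleet specifications into the per-unit metrics above; the example values are illustrative placeholders (an S21-class miner spec, a hypothetical 8-GPU inference node, and a GPT-3-scale training run), not measurements from any specific deployment.

```python
# Back-of-envelope conversions between hardware-intensity metrics.
# All example inputs are illustrative, not measured values.

def joules_per_terahash(power_watts: float, hashrate_th_s: float) -> float:
    """Mining efficiency: device power (W) divided by hashrate (TH/s) gives J/TH."""
    return power_watts / hashrate_th_s

def joules_per_token(node_power_watts: float, tokens_per_second: float) -> float:
    """Inference intensity: average serving-node power divided by token throughput."""
    return node_power_watts / tokens_per_second

def mwh_per_training_run(num_accelerators: int, avg_power_watts: float, hours: float) -> float:
    """Training intensity: fleet power integrated over the run duration, in MWh."""
    return num_accelerators * avg_power_watts * hours / 1e6

print(joules_per_terahash(3500, 200))         # 17.5 J/TH for an S21-class spec
print(joules_per_token(5600, 1000))           # 5.6 J/token for a hypothetical node at 1,000 tok/s
print(mwh_per_training_run(1024, 500, 2500))  # ~1,280 MWh, in the ballpark reported for GPT-3
```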
The following sections compare these dimensions for PoW crypto mining and AI generation, focusing on current empirical evidence and structural drivers.
Crypto mining hardware and energy footprint
Network‑level electricity consumption
The U.S. Energy Information Administration (EIA) reports that the Cambridge Bitcoin Electricity Consumption Index (CBECI) estimated global Bitcoin mining power demand at the end of January 2024 at around 19 GW, with a plausible range from 9.1 to 44 GW. Multiplying these power levels by hours in a year yields an annual electricity demand range of roughly 80–390 TWh, with a point estimate of about 170 TWh for Bitcoin alone, corresponding to roughly 0.2–0.9% of global electricity demand. Cambridge’s 2025 Digital Mining Industry work and derivative summaries similarly place Bitcoin’s 2024 electricity consumption around 138 TWh, or about 0.5–0.54% of global electricity consumption. Independent commentators drawing on CBECI data describe Bitcoin’s current annual electricity use in 2025 as stabilizing around 175 TWh, also about 0.5% of global demand and comparable to the consumption of a mid‑sized country.[3][5][7][1][28]
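As a sanity check, the conversion from instantaneous power to annual energy is simply power multiplied by hours in a year; the sketch below reproduces the cited range from the CBECI power estimates (small differences from the quoted 170 TWh point estimate come from rounding and the exact averaging window).

```python
# Annualize CBECI-style instantaneous power estimates (GW) into TWh.
HOURS_PER_YEAR = 8760

def annual_twh(power_gw: float) -> float:
    return power_gw * HOURS_PER_YEAR / 1000  # GW x h = GWh; divide by 1,000 for TWh

for label, gw in [("lower bound", 9.1), ("point estimate", 19.0), ("upper bound", 44.0)]:
    print(f"{label}: {annual_twh(gw):.0f} TWh/yr")
# lower bound: 80 TWh/yr, point estimate: 166 TWh/yr, upper bound: 385 TWh/yr
```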
These estimates highlight that a single PoW asset can command an energy footprint comparable in magnitude to that of the entire global data‑centre sector as it stood a decade ago, even before other PoW chains are included.[5][6][1][3]
Hardware specialization and efficiency metrics
Bitcoin mining hardware has evolved from CPUs and GPUs to highly specialized ASICs optimized solely for the SHA‑256 hashing algorithm. The Cambridge Digital Mining Industry report estimates that average network‑wide mining hardware efficiency improved by about 24% year‑on‑year, reaching around 28.2 joules per terahash (J/TH) by mid‑2024. This reflects an industry‑wide shift toward newer ASIC generations such as Bitmain’s Antminer S21 series, which advertises efficiencies of 17.5 J/TH for the 200 TH/s air‑cooled model and 16 J/TH for the hydro‑cooled 335 TH/s variant.[29][15][18][11][28]
At these specifications, an air‑cooled S21 drawing 3.5 kW at 200 TH/s and a hydro‑cooled S21 drawing about 5.36 kW at 335 TH/s both substantially outperform prior‑generation S19 devices that operated above 20 J/TH, but they remain single‑purpose devices whose value is tightly coupled to Bitcoin’s price and network difficulty. Industry disclosures from large listed miners such as Core Scientific indicate fleet‑wide average efficiencies around 25–32 J/TH as they progressively replace older machines with S21‑class hardware, underscoring how aggregate network efficiency lags the leading edge due to deployment constraints and capex cycles.[15][27][22][11][29]
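A hashrate-weighted average over a mixed fleet shows why disclosed fleet efficiencies of 25–32 J/TH sit well above the 16–17.5 J/TH of the newest machines; the fleet composition below is hypothetical, chosen only to illustrate the lag.

```python
# Hashrate-weighted fleet efficiency for a hypothetical mix of ASIC generations.
fleet = [
    (0.6, 30.0),   # share of hashrate on older S19-class machines (~30 J/TH)
    (0.2, 21.5),   # S19 XP-class machines
    (0.2, 17.5),   # newest S21-class machines
]

fleet_j_per_th = sum(share * eff for share, eff in fleet)
print(f"Fleet average: {fleet_j_per_th:.1f} J/TH")
# ~25.8 J/TH even though 17.5 J/TH hardware is shipping, because upgrades roll out gradually.
```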
Market concentration and deployment patterns
Survey data from the Cambridge Digital Mining Industry work indicate that a handful of manufacturers dominate the ASIC market, with Bitmain controlling more than 80% of the Bitcoin ASIC installed base, followed by MicroBT and Canaan. Geographically, mining hashrate is heavily concentrated in North America, with recent Cambridge surveys suggesting that around three‑quarters of reported hashrate in the sample originates from the United States, with Canada a distant second. This concentration reflects access to relatively cheap or stranded energy, hospitable regulation, and the ability to finance large‑scale industrial sites.[1][5]
Operationally, mining farms are designed to run ASICs at or near full utilization 24/7 in order to maximize revenue per unit of capex, often in remote regions with abundant hydropower, wind, or natural‑gas‑fired electricity. This creates large, relatively inflexible loads that can strain local grids or, in some contexts, provide flexible demand response if integrated with grid operators.[3][28][1]
Energy mix, emissions, and e‑waste
Cambridge’s recent work estimates that around 52.4% of Bitcoin mining’s electricity comes from sustainable sources when combining renewables and nuclear, with roughly 42.6% from renewables alone and the remainder largely from natural gas and coal. Even with this mix, the same work estimates network‑wide emissions near 39.8 million tonnes of CO₂‑equivalent per year, or around 0.08% of global greenhouse‑gas emissions. Other analyses based on older methodologies yield somewhat higher emission estimates but broadly agree that mining’s footprint is non‑trivial relative to its narrow financial use case.[30][21][5][1]
On the hardware‑waste side, Cambridge’s Digital Mining report projects about 2.3 kilotonnes of e‑waste from retired Bitcoin mining equipment in 2024, a figure that may be lower than earlier alarmist estimates but still significant given the narrow functionality of ASICs. Because ASICs cannot be repurposed for general compute or AI workloads, obsolescent mining hardware typically has little residual value beyond recycling, making crypto mining hardware intensity particularly rigid and path‑dependent.[28][1]
AI generation hardware and energy footprint
Data‑centre electricity demand and AI’s share
The IEA estimates that global data centres consumed about 415 TWh of electricity in 2024, equivalent to roughly 1.5% of global electricity consumption. In the United States, data centres accounted for approximately 183 TWh in 2024—more than 4% of national electricity use and roughly equal to the annual demand of Pakistan. Across 2019–2024, data‑centre electricity consumption has grown at around 12% per year globally, with the surge in AI workloads since 2022 singled out as the primary driver of future demand growth.[4][6][8][24][2]
IEA modelling suggests that data‑centre electricity usage could more than double to around 945 TWh by 2030, with AI workloads dominating incremental demand and making data centres responsible for a sizable share of total electricity demand growth in advanced economies. Industry analyses for 2025–2026 suggest that global capex on AI‑oriented data‑centre infrastructure, including GPU and accelerator purchases, is reaching several hundred billion dollars annually, with one estimate citing about 580 billion dollars of AI‑focused data‑centre investment in 2025 alone. This underscores that AI hardware intensity is not just about watts and joules, but also about extraordinary capital concentration.[6][8][2][4]
Accelerator hardware characteristics
Modern AI generation relies on highly parallel accelerators—GPUs, TPUs, and custom ASICs—that deliver massive floating‑point throughput at significant power draw. Nvidia’s H100 Tensor Core GPU, widely used for training and inference of frontier models, offers up to roughly 3,958 FP8 tensor TFLOPs and 1,979 FP16 tensor TFLOPs in its SXM configuration, paired with around 80 GB of high‑bandwidth memory and up to 3.35 TB/s of memory bandwidth. The maximum thermal design power (TDP) for H100 SXM is specified at up to about 700 W, while PCIe variants operate at 300–400 W TDP, meaning that an 8‑GPU server can easily draw several kilowatts at full load even before accounting for CPUs, networking, and cooling.[31][32][33][16][34][13]
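The claim that an 8-GPU server can easily draw several kilowatts follows from a rough sum of component budgets; the host-overhead and facility-overhead (PUE) values below are assumptions for illustration, not vendor specifications.

```python
# Rough power budget for a hypothetical 8x H100 SXM training server.
gpu_tdp_w = 700          # per-GPU max TDP (SXM)
num_gpus = 8
host_overhead_w = 1500   # assumed CPUs, memory, NICs, fans (varies widely by platform)
pue = 1.3                # assumed facility overhead for cooling and power distribution

it_power_kw = (gpu_tdp_w * num_gpus + host_overhead_w) / 1000
facility_power_kw = it_power_kw * pue
print(f"IT load: {it_power_kw:.1f} kW, with facility overhead: {facility_power_kw:.1f} kW")
# IT load: 7.1 kW; ~9.2 kW including facility overhead at full utilization
```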
Comparative analyses of accelerators report that Google’s TPU v4 delivers up to around 275 TFLOPs, while Nvidia’s earlier‑generation A100 GPU offers roughly 156 TFLOPs, with newer TPU generations (such as TPU v5) reaching about 460 TFLOPs mixed‑precision. These sources and architectural surveys converge on the view that TPUs typically deliver about 2–3 times better performance per watt than contemporary GPUs for many AI workloads, while both drastically outperform CPUs on energy‑normalized throughput, where GPUs can provide tens of times higher TFLOPs per watt than general‑purpose server CPUs.[12][14][33][35]
Training energy for frontier models
Because training large models is a one‑off but very intensive process, its hardware intensity is best measured in MWh per training run. Multiple independent analyses of the GPT‑3 175‑billion‑parameter model estimate that training consumed on the order of 1,287 MWh of electricity, corresponding to the annual electricity use of around 100–120 average U.S. homes and emitting roughly 550 tonnes of CO₂ under typical grid mixes. Academic and industry commentaries emphasize that this figure excludes energy spent on failed experiments, hyperparameter sweeps, architecture searches, and data‑processing pipelines, which can multiply the effective training‑phase energy footprint by factors of 20–100 for frontier models.[36][10][37][9]
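The household and emissions equivalences follow from the 1,287 MWh estimate once a per-home consumption and a grid emission factor are assumed; the two assumed values below are typical U.S. figures rather than numbers taken from the cited studies.

```python
# Put the ~1,287 MWh GPT-3 training estimate in everyday terms.
training_mwh = 1287
us_home_kwh_per_year = 10_800   # assumed average U.S. household consumption
grid_kg_co2_per_kwh = 0.43      # assumed grid emission factor

homes = training_mwh * 1000 / us_home_kwh_per_year
emissions_tonnes = training_mwh * 1000 * grid_kg_co2_per_kwh / 1000
print(f"~{homes:.0f} home-years of electricity, ~{emissions_tonnes:.0f} t CO2")
# ~119 home-years and ~553 t CO2, consistent with the ranges cited above
```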
Despite these large one‑time costs, training energy can be amortized over billions of subsequent inferences; if a model serving stack handles billions of queries over its lifetime, the training‑phase joules per query can be small relative to inference‑phase energy. This differs fundamentally from PoW mining, where each hash attempt is an expendable, non‑amortizable computation whose only purpose is to win the next block.[10][38]
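A minimal amortization sketch, assuming a hypothetical lifetime query volume and reusing the GPT-3-scale training figure plus the per-query inference estimate cited in the next section, shows how quickly training energy per query shrinks at scale.

```python
# Amortize a one-off training cost over an assumed lifetime query volume.
training_wh = 1287 * 1e6          # ~1,287 MWh expressed in Wh
inference_wh_per_query = 0.34     # median conversational query (see inference section below)

for lifetime_queries in (1e8, 1e9, 1e10):
    amortized = training_wh / lifetime_queries
    share = amortized / (amortized + inference_wh_per_query)
    print(f"{lifetime_queries:.0e} queries: +{amortized:.2f} Wh/query ({share:.0%} of total)")
# 1e8 queries: +12.87 Wh (97%); 1e9: +1.29 Wh (79%); 1e10: +0.13 Wh (27%)
```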
Inference energy per query and per token
Recent work has shifted attention from training to inference energy, since deployed models may serve millions to billions of queries per day. A 2025 methodology paper on large‑scale LLM inference estimates the median energy per query for frontier‑scale models (over 200 billion parameters) running on well‑utilized H100 nodes at around 0.34 Wh in a "traditional" conversational regime with a median of 300 output tokens. Under "test‑time scaling" scenarios with much longer outputs (median 5,000 tokens), the same study estimates median energy per query of roughly 4.3 Wh—about 13 times higher—illustrating how output length and reasoning depth strongly affect hardware intensity per interaction.[25]
Complementary benchmarking work introduces energy‑centric metrics such as joules per token and joules per response for LLM inference and shows that energy per token can vary by factors of two to three across models of similar size, depending on architecture and implementation details. Experiments comparing FP16 and FP8 quantization for large models (for example, Llama‑class 405‑billion‑parameter systems) indicate that moving to FP8 can reduce energy per token by roughly 30% under heavy load, primarily by lowering memory traffic and arithmetic cost, while advanced batching regimes further reduce joules per token as GPU utilization improves.[26][39][40]
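A toy per-token calculation makes the quantization effect concrete; the node power and throughput figures are hypothetical, and only the roughly 30% FP8 reduction is taken from the cited experiments.

```python
# Energy per token for a hypothetical serving node, before and after FP8 quantization.
node_power_w = 5600          # assumed 8-GPU inference node at high utilization
fp16_tokens_per_s = 1000     # assumed aggregate decode throughput at FP16

def joules_per_token(power_w: float, tokens_per_s: float) -> float:
    return power_w / tokens_per_s

fp16_jpt = joules_per_token(node_power_w, fp16_tokens_per_s)
fp8_jpt = fp16_jpt * (1 - 0.30)   # ~30% reduction reported for FP8 under heavy load
print(f"FP16: {fp16_jpt:.2f} J/token, FP8: {fp8_jpt:.2f} J/token")
# A 300-token response costs ~1,680 J (0.47 Wh) at FP16 vs ~1,176 J (0.33 Wh) at FP8.
```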
Comparative hardware intensity: crypto vs AI
At the macro level, current best estimates imply that Bitcoin mining alone consumes on the order of 120–175 TWh per year, around 0.5% of global electricity use, with plausible bounds of 80–390 TWh depending on assumptions and time window. Global data centres, encompassing AI workloads, cloud computing, storage, and networking, consumed about 415 TWh in 2024, or roughly 1.5% of world electricity use, and could reach around 945 TWh by 2030 if current trends continue. In the United States, data centres already consume more electricity annually than many entire countries, at about 183 TWh in 2024, and are expected to more than double their demand by 2030, with AI as a central driver.[7][8][24][2][4][5][6][1][3]
This implies that, in absolute terms, today’s global data‑centre sector already uses more electricity than Bitcoin mining but delivers a vastly broader set of services, from cloud hosting and video streaming to AI training and inference, whereas Bitcoin’s PoW network uses a substantial fraction of that power budget for a single payment and store‑of‑value system.[8][2][6][3]
For crypto mining, a standard unit of functionality is a successful block or transaction, though the underlying work is measured in hashes. CBECI‑based analyses put Bitcoin's annual electricity use at approximately 175 TWh and estimate that each on‑chain transaction effectively accounts for on the order of 1,400 kWh of electricity, enough to power a typical American household for several weeks. This figure reflects the fact that mining energy is largely independent of transaction count and is instead tied to network difficulty and price. From a hardware‑intensity perspective, each joule of mining energy secures the ledger by making double‑spend attacks more expensive, but does not directly generate additional useful output beyond consensus.[7][3]
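The roughly 1,400 kWh per transaction figure is an attributional average, annual network energy divided by annual on-chain transaction count; the transaction volume below is an assumed value used for illustration.

```python
# Effective energy attributed to each on-chain Bitcoin transaction.
# Attributional, not causal: mining energy does not scale with transaction count.
annual_twh = 175
annual_transactions = 125e6   # assumed on-chain transactions per year (~340k/day)

kwh_per_tx = annual_twh * 1e9 / annual_transactions
print(f"~{kwh_per_tx:.0f} kWh per transaction")   # ~1,400 kWh
# At roughly 30 kWh/day for a typical U.S. household, that is several weeks of household use.
```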
For AI generation, functional units are more varied: tokens answered, images generated, or tasks completed. The inference study cited earlier suggests that for a frontier LLM, a typical conversational query with a few hundred output tokens may consume around 0.3–0.5 Wh, while highly extended reasoning sequences might consume a few Wh per query. Energy‑per‑token benchmarking shows that careful model and system design can move joules per token down by factors of 2–3, and quantization alone can cut energy per token by roughly 30%. Even adding amortized training costs, per‑query hardware intensity for large‑scale AI services is generally orders of magnitude lower than per‑transaction intensity in Bitcoin, albeit applied to a much broader base of activity.[39][9][10][25][26]
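Putting the two per-unit figures side by side makes the "orders of magnitude" claim concrete, with the obvious caveat that a payment transaction and an LLM query are not equivalent units of value; the training amortization added below reuses the earlier hypothetical sketch.

```python
# Ratio of per-transaction Bitcoin energy to per-query LLM inference energy.
btc_kwh_per_tx = 1400
llm_wh_per_query = 0.34               # median conversational query, frontier-scale model
llm_wh_incl_training = 0.34 + 0.13    # adding the assumed amortized training cost from earlier

print(f"{btc_kwh_per_tx * 1000 / llm_wh_per_query:.1e}")       # ~4.1e6 queries per transaction
print(f"{btc_kwh_per_tx * 1000 / llm_wh_incl_training:.1e}")   # ~3.0e6 including amortized training
```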
Specialization, reuse, and opportunity cost
Bitcoin ASICs are highly specialized: an S21 miner cannot be repurposed for AI workloads, scientific computing, or general cloud services, and becomes economically obsolete when its joules per terahash fall behind newer machines given prevailing electricity prices and Bitcoin rewards. This makes crypto mining hardware intensity relatively inelastic: once capital is sunk into ASICs and containers, the only economically rational choice is to run them near continuously until marginal revenue falls below marginal cost, at which point equipment is idled or scrapped.[27][11][15][3][28]
AI accelerators, by contrast, are flexible: an H100 or TPU v4 fleet can be reallocated among model training, inference, traditional HPC, and even non‑AI workloads, allowing data‑centre operators to chase higher‑value uses for each joule of compute. The same hardware investment can thus support a rotating portfolio of models and applications, from recommendation systems to scientific simulations, making AI hardware intensity more elastic and arguably more productive per unit of energy and capex than PoW mining hardware.[14][35][2][12][13]
Consensus design and the Ethereum case
Proof‑of‑stake as a hardware‑light alternative
Ethereum’s transition from PoW to PoS in 2022 provides an empirical natural experiment in consensus‑mechanism hardware intensity. Pre‑Merge, Ethereum’s PoW network consumed energy on the order of tens of TWh per year; the Ethereum Foundation and subsequent analyses estimated that its energy use was comparable to that of a medium‑sized country. Post‑Merge, official Ethereum documentation and third‑party estimates indicate that switching to PoS reduced Ethereum’s energy consumption by approximately 99.95–99.98%, leaving it at around 0.01 TWh per year and roughly 0.01 Mt CO₂ annually.[17][19][20][21][23]
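The post-Merge figure is consistent with applying the cited percentage reduction to a pre-Merge estimate in the low tens of TWh; the 21 TWh starting point below is one commonly quoted pre-Merge estimate, used here purely for illustration since published pre-Merge figures varied widely.

```python
# Apply the cited ~99.95% reduction to an illustrative pre-Merge consumption estimate.
pre_merge_twh = 21.0     # assumed pre-Merge annual consumption (published estimates varied widely)
reduction = 0.9995

post_merge_twh = pre_merge_twh * (1 - reduction)
print(f"{post_merge_twh:.4f} TWh/yr")   # ~0.0105 TWh/yr, consistent with the ~0.01 TWh figure above
```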
Because PoS selects validators based on staked economic value rather than computational work, its hardware requirements are modest: commodity servers can validate blocks with minimal incremental power use compared with mining farms, eliminating the need for dedicated ASIC fleets. This illustrates that the extreme hardware and energy intensity observed in Bitcoin is a function of PoW’s design rather than a necessary condition for decentralized consensus, and that alternative designs can effectively de‑materialize the hardware footprint of a major smart‑contract platform while preserving functionality.[20][21][23]
Implications for crypto hardware trajectories
The Ethereum example suggests that, for new or existing chains prioritizing sustainability, consensus changes can reduce hardware intensity by orders of magnitude, shifting costs from electricity and devices toward financially staked capital and protocol design. For Bitcoin, however, social and political constraints make a shift away from PoW unlikely in the near term, meaning that improvements in ASIC efficiency (J/TH) and shifts toward cleaner energy sources will be the primary levers for reducing environmental impact rather than fundamental changes in hardware demand.[18][21][23][30][17][28]
Supply chains, geographic concentration, and systemic risk
On the crypto side, Cambridge's survey‑based analysis indicates that Bitmain alone commands more than four‑fifths of the ASIC market, creating a highly concentrated supply chain whose manufacturing is geographically clustered in East Asia. Network hashrate is likewise concentrated, with a large share of mining capacity located in the United States and Canada following regulatory shifts in China, exposing the ecosystem to jurisdiction‑specific policy risks and local grid constraints.[1][5][28]
In AI, the hardware supply chain is dominated by a small number of chip designers and cloud providers. Nvidia effectively controls the high‑end GPU market used for LLMs, as evidenced by the centrality of A100, H100, and H200 GPUs in accelerator comparison tables, while Google's TPUs are vertically integrated into its own cloud. These accelerators depend on advanced semiconductor fabrication nodes operated by a handful of foundries, so AI's hardware intensity interacts with geopolitical risk in a way that differs from, but is no less significant than, crypto's ASIC dependence.[33][35][13][14]
From an energy‑system perspective, crypto mining loads can, in principle, be colocated with remote renewable or otherwise stranded energy sources, and there is evidence of off‑grid operations using curtailed hydropower or flared gas to power miners. AI‑oriented data centres, however, have stronger dependencies on low‑latency connectivity, grid reliability, and proximity to major user bases, leading to clustering near metropolitan hubs where grid capacity is already stressed.[24][2][8][30][28][1]
Efficiency improvements and rebound effects
Both domains are pursuing aggressive efficiency improvements, but with different mechanisms and rebound dynamics. In Bitcoin mining, hardware efficiency measured in J/TH has improved by roughly 20–30% per year in recent periods, driven by new ASIC generations such as the S21 series and by optimizations in cooling and facility design. However, because the protocol automatically adjusts difficulty to maintain a roughly constant block interval, increases in hardware efficiency mostly translate into higher global hashrate rather than proportional energy savings, unless miners voluntarily cap expenditure or face strict external constraints.[11][15][18][3][28]
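A stylized break-even model illustrates the rebound mechanism: if miners keep adding hashrate until a fixed share of block-reward revenue is spent on electricity, network energy is pinned by economics and better J/TH mainly shows up as more hashrate. All parameter values below are assumptions chosen for illustration.

```python
# Stylized mining equilibrium: electricity spend is a fixed share of block-reward revenue.
daily_revenue_usd = 40e6         # assumed block rewards plus fees per day
electricity_share = 0.5          # assumed fraction of revenue spent on power at equilibrium
power_price_usd_per_kwh = 0.05   # assumed average industrial electricity price

def equilibrium(j_per_th: float):
    daily_kwh = daily_revenue_usd * electricity_share / power_price_usd_per_kwh
    th_per_day = daily_kwh * 3.6e6 / j_per_th        # kWh -> J, then J / (J/TH) = TH
    return daily_kwh, th_per_day / 86400             # hashrate in TH/s

for eff in (28.2, 17.5):
    kwh, th_s = equilibrium(eff)
    print(f"{eff} J/TH -> {kwh/1e6:.0f} GWh/day, {th_s/1e6:.0f} EH/s")
# Energy stays at ~400 GWh/day (~146 TWh/yr) in both cases; hashrate rises from ~591 to ~952 EH/s.
```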
In AI, efficiency gains occur at multiple layers: algorithmic innovations (e.g., better architectures, pruning, distillation), system‑level optimizations (e.g., batching, serving architectures), and hardware advances (e.g., higher TFLOPs per watt in new GPU/TPU generations). Studies of inference energy indicate that combining quantization, improved utilization, and smarter routing can plausibly reduce joules per query by factors of 8–20 relative to naive baselines, even as model sizes grow. At the same time, commentary on the "Jevons paradox" in AI notes that efficiency gains may spur greater total usage—more queries, broader deployment, and more demanding test‑time computation—so aggregate energy use can still rise despite lower energy per operation.[40][12][14][25][26][39]
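The 8–20x range is plausible as a product of independent gains at different layers; the decomposition below uses hypothetical factors for batching and routing, with only the quantization figure anchored to the roughly 30% reduction cited above.

```python
# Multiplicative stacking of efficiency gains across layers (hypothetical factors).
gains = {
    "FP8 quantization":        1 / 0.7,   # ~30% energy-per-token reduction (cited above)
    "batching / utilization":  2.5,       # assumed gain from keeping accelerators busy
    "routing to small models": 3.0,       # assumed gain from sending easy queries to cheaper models
}

total = 1.0
for name, factor in gains.items():
    total *= factor
    print(f"{name}: x{factor:.2f} (cumulative x{total:.1f})")
# Cumulative ~x10.7, within the 8-20x range cited for combined optimizations.
```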
Policy and design implications
From a policy standpoint, the hardware intensity of PoW crypto and AI generation raises overlapping but distinct concerns. For crypto, regulators and energy authorities focus on local grid impacts, emissions, and opportunity cost of devoting substantial electricity to a non‑productive hash race, prompting measures such as moratoria, energy‑use reporting, and differential tariffs in some jurisdictions. For AI and data centres, concerns extend beyond emissions to include transmission build‑out, land use, water consumption for cooling, and the risk that unconstrained AI‑driven demand could outpace grid expansion, crowding out electrification in other sectors.[2][4][6][8][24][3][28]
Design choices can mitigate or exacerbate hardware intensity. In crypto, moving from PoW to PoS or other low‑energy consensus schemes, or constraining difficulty growth through protocol design, can dramatically reduce hardware and energy requirements. In AI, prioritizing energy‑efficient architectures, deploying model compression and routing so that small models handle routine tasks, and aligning business incentives with energy‑aware metrics (e.g., joules per useful task) can steer the ecosystem toward higher value per unit of hardware and electricity.[21][23][17][25][26][39]
Overall, PoW crypto mining epitomizes a form of hardware intensity where energy expenditure is the security budget itself, making electricity and ASICs the core economic inputs to consensus. AI generation, by contrast, channels large but more malleable hardware investments into a wide variety of tasks, with strong scope for efficiency gains and reuse but also powerful demand‑side pressures that may, absent policy and design interventions, drive total hardware and energy consumption sharply upward.
Sources
Inside the 2025 Cambridge Digital Mining Report: Energy ... - Cambridge Centre for Alternative Finance (CCAF)
Growing Energy Demand of AI - Data Centers 2024–2026 - TTMS
Tracking electricity consumption from U.S. cryptocurrency mining ... - U.S. Energy Information Administration (EIA)
Electricity Demand and Grid Impacts of AI Data Centers - arXiv
Cambridge study: sustainable energy rising in Bitcoin mining
Data Centers Will Use Twice as Much Energy by 2030—Driven by AI
Cost Analysis: Mining One...
Global data center power demand to double by 2030 on AI surge: IEA
Solutions to the AI Energy Demand - Tepperspectives
How Much Energy Will It Take To Power AI?
What is the Antminer S21? Everything to Know About Bitmain's ...
TPU vs GPU: Comprehensive Technical Comparison - Wevolver
NVIDIA H100 Tensor Core GPU - Colfax International
TPU vs GPU: A Comprehensive Technical Comparison - Wevolver
Antminer S21 Review: A Game-Changer in Bitcoin Mining Efficiency
H100 GPU - A Massive Leap in Accelerated Compute
Ethereum's Proof of Stake Cuts Energy Consumption by 99.95%
Cambridge Digital Mining Industry Report: Global Operations ...
Ethereum's PoS Transition Reduces Energy Consumption by 99.95%
Explained: Proof-of-Work vs. Proof-of-Stake Carbon Footprint
Bitcoin's Energy Consumption May Plateau by Next Halving
Proof-of-stake vs proof-of-work - Ethereum.org
US data centers' energy use amid the artificial intelligence boom
[Paper review] Energy Use of AI Inference: Efficiency ...
Core Scientific Announces April 2024 Production and Operations ...
Bitcoin News Today: Bitcoin Miners' Green Arms Race - AInvest
Bitcoin miners double down on efficiency and renewable energy at the World Digital Mining Summit
AI Accelerator Comparison Tables - Spill / Fill
CPU vs GPU vs TPU vs NPU: AI Hardware Architecture Guide 2025
Optimization could cut the carbon footprint of AI training by up to 75%
[PDF] Compression-Induced Communication-Efficient Large Model ... - arXiv
[PDF] Assessing the Energy Impact and Carbon Footprint of AI Model ...
[PDF] Advocating Energy-per-Token in LLM Inference - EuroMLSys
Diagnosing Inference Energy Consumption with the ML.ENERGY ... - ML.ENERGY research & tech blog