LLMs and Employment?

Aug 21, 2025

Welcome to the age of LLMs. Half the population thinks LLMs think just because someone implemented a <Loader/> after every prompt that says “thinking...”. I’m not going to lie: it does give the appearance of thinking.

Thinking - ideas or opinions about something.

AI Development = Speed + Thoughtful decisions? In my experience, using AI in my development life cycle drastically improved at least one of the two, but the two often worked against each other: I found myself being fast with it or being thoughtful with it, never both. The balance was missing.

Have you ever had a very junior intern working under you who, no matter how much you explained, just would not understand? It’s not because you cannot explain it well enough, nor because they are too dumb to understand. It’s just that their brain has not yet been exposed to the problem at hand, and it is way above their current understanding of software. That is the state of LLMs these days. To make them understand, people started using RAG, fine-tuning, reinforcement learning, and so on, to improve responses. But even with all of this implemented and discussed, at the end of the day the model is predicting, not thinking. The reason LLMs appear to be thinking is that they are trained on high-level English grammar and have enough context to appear to have emotions. Below you can see how it predicts tokens for this controversial paragraph.

[Image: when I asked it to be controversial]

When I asked ChatGPT to fix the above paragraph while keeping it controversial, it started with "Got it 😏"!
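
To make “predicting” concrete, here is a toy sketch of how a next token gets chosen. The logits are made-up numbers, not from any real model; real vocabularies have tens of thousands of entries.

```python
# Toy next-token prediction: score candidates, softmax into probabilities,
# pick the most likely one. Hypothetical logits, not from a real model.
import math

logits = {"thinking": 2.1, "predicting": 3.0, "feeling": 0.4}

# Softmax turns raw scores into a probability distribution.
z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}

# Greedy decoding simply emits the highest-probability token.
print(max(probs, key=probs.get))   # -> "predicting"
print(probs)
```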

With all that being said, how is the employment scenario looking? OpenAI recently launched its ₹399/month plan in India. Why is that such a big deal? Let’s run a hypothetical calculation; it might be wrong in places, but it gives an idea of the scale.

Definitions:

  • Weights: GPU memory holding model parameters.
  • KV cache: per-token attention state stored during decoding.
  • FLOPs: total floating-point operations (2 × params × tokens used here).
  • Latency: estimated request processing time on the cluster.
  • Energy: power × time (reported in Wh and kJ).

Selected configuration:

  • Model: 4.0 × 10¹² params (4T)
  • Weights precision: INT8 (1 byte/weight)
  • KV cache precision: FP16 (2 bytes/element)
  • Tokens (prompt/gen): 4,000 in / 500 out → 4,500 total
  • Hardware: H100 SXM 80 GB (assume 80 GB usable)
  • Throughput assumption: 400 TFLOPS/GPU sustained
  • Power/GPU: 600 W
  • Sharding: assume ideal linear sharding across GPUs

Calculations:

Weights:

4,000,000,000,000 weights × 1 byte = 4,000 GB (≈4.0 TB)

KV Cache:

Assuming ~80 layers, 128 heads, 128 head_dim for a 4T model

KV = 2 × batch × seq_len × layers × heads × head_dim × precision

KV = 2 × 1 × 4,500 × 80 × 128 × 128 × 2 bytes ≈ 23.6 GB

Overhead ≈ 15% of (weights + KV) ≈ 604 GB

Total memory ≈ 4,628 GB (≈4.63 TB)

Min H100s (80 GB each) = ceil(4,628 / 80) = 58 GPUs

Total FLOPs = 2 × params × tokens = 2 × 4×10¹² × 4,500 = 36,000 TFLOPs

Cluster throughput = 400 × 58 = 23,200 TFLOPS

Latency ≈ 36,000 / 23,200 ≈ 1.55 s

Energy = Power × Time = (600 × 58) W × 1.55 s ≈ 54,000 J ≈ 15 Wh

Cost @ $0.12/kWh in USA ≈ $0.0018 / request
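
If you want to sanity-check these numbers, here is a small Python sketch that reproduces the back-of-envelope estimate above. Every constant is one of the assumptions from the configuration list, not a measured value.

```python
# Back-of-envelope per-request estimate, using the assumptions above.
import math

PARAMS = 4.0e12                          # 4T parameters
BYTES_PER_WEIGHT = 1                     # INT8 weights
LAYERS, HEADS, HEAD_DIM = 80, 128, 128   # assumed shape for a 4T model
TOKENS = 4_000 + 500                     # prompt + generated
KV_BYTES = 2                             # FP16 KV cache
GPU_MEM_GB = 80                          # H100 SXM
GPU_TFLOPS = 400                         # sustained throughput assumption
GPU_WATTS = 600

weights_gb = PARAMS * BYTES_PER_WEIGHT / 1e9
# K and V for every token, layer, and head (batch size 1)
kv_gb = 2 * 1 * TOKENS * LAYERS * HEADS * HEAD_DIM * KV_BYTES / 1e9
total_gb = 1.15 * (weights_gb + kv_gb)          # +15% overhead

gpus = math.ceil(total_gb / GPU_MEM_GB)
tflops_needed = 2 * PARAMS * TOKENS / 1e12      # 2 × params × tokens
latency_s = tflops_needed / (GPU_TFLOPS * gpus)
energy_wh = GPU_WATTS * gpus * latency_s / 3600
cost_usd = energy_wh / 1000 * 0.12              # $0.12 per kWh

print(f"{total_gb:,.0f} GB -> {gpus} GPUs, {latency_s:.2f} s, "
      f"{energy_wh:.1f} Wh, ${cost_usd:.4f}/request")
```

Running it prints roughly 4,627 GB → 58 GPUs, 1.55 s, 15.0 Wh, and $0.0018 per request, matching the numbers above.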

With all these calculations in mind: according to reports, India produces about 1.5 million engineers every year. If 50% of them subscribed to the mentioned OpenAI plan, that would be 750,000 × ₹399 ≈ ₹29.9 crore per month. But why does this matter?

Scale calculations:

Daily Volume:
750,000 users × 100 requests/day = 75,000,000 requests/day
Daily Energy Consumption:
75,000,000 requests × 0.015 kWh = 1,125,000 kWh = 1.125 GWh/day
Daily Electricity Cost:
75,000,000 requests × $0.0018 = $135,000/day
Processing Time Requirements:
75,000,000 × 1.5 seconds = 112.5 million seconds = 31,250 hours of cluster time (each request occupies the full 58-GPU cluster for ≈1.5 s)
With 60 such clusters: 31,250 ÷ 60 ≈ 520.8 hours of continuous operation needed
That's 21.7 days of continuous operation to process one day's requests!
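
The same sketch extended to the hypothetical scale. Again, the subscriber count and 100 requests/day are assumptions, not data.

```python
# Scaling the per-request numbers to the hypothetical Indian user base.
USERS = 750_000            # assumed 50% of 1.5M engineers subscribe
REQS_PER_USER = 100        # assumed requests per user per day
KWH_PER_REQ = 0.015        # ≈15 Wh per request, from above
COST_PER_REQ = 0.0018      # USD per request, from above
SECONDS_PER_REQ = 1.5      # each request occupies the 58-GPU cluster
CLUSTERS = 60              # assumed number of such clusters

reqs = USERS * REQS_PER_USER
print(f"{reqs:,} requests/day")                    # 75,000,000
print(f"{reqs * KWH_PER_REQ / 1e6:.3f} GWh/day")   # 1.125
print(f"${reqs * COST_PER_REQ:,.0f}/day")          # $135,000
cluster_hours = reqs * SECONDS_PER_REQ / 3600
print(f"{cluster_hours:,.0f} cluster-hours -> "
      f"{cluster_hours / CLUSTERS / 24:.1f} days on {CLUSTERS} clusters")
```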

On average, a US citizen consumes about 10,791 kWh annually (2022). My instinct is that AI will create an artificial demand for energy that pushes up the natural baseline of consumption. In the short run, this could drive per-unit energy prices higher, since production needs to catch up, much like the early days of cloud computing. But over time, while total consumption will keep rising, the cost per task will fall as efficiency improves and economies of scale kick in. Just as cloud created entirely new roles, AI will also open up fresh opportunities, even as it pressures existing ones. The real challenge is to stay in sync with this progress — making sure AI remains a friend, not a foe, supporting us without disrupting the very jobs it was meant to empower.

NOTE: The above calculations are purely demonstrative and can be wrong in many ways. They are only meant to show that re-employment of currently laid-off engineers is inevitable, provided their skills are up to the mark.

Art - From a different perspective