Data Vault and Telenor Pakistan have launched the nation's first dedicated AI data center in Karachi. It is designed to support startups, researchers, and government agencies with high-performance computing and GPU-as-a-service offerings. It is equipped with more than 3,000 of Nvidia's highest-performance H100 and H200 GPUs, for which the Trump administration issued export licenses. These GPUs cost $40,000 to $60,000 each, making the Nvidia chips the biggest chunk of the investment in this AI data center. Other data centers in Pakistan also support AI workloads, but this new facility in Karachi is designed specifically for AI. It puts the country on a short list of nations with locally hosted AI data centers. Pakistanis rank among the world's top five users of artificial intelligence (AI) tools, securing the fourth spot among 21 nations surveyed by the Schwartz Reisman Institute.
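As a rough back-of-the-envelope check on the GPU figures above, the sketch below multiplies the stated GPU count by the quoted price range. It assumes exactly 3,000 GPUs (the post says "more than 3,000") and ignores networking, cooling, and facility costs, so it is only an illustration of the order of magnitude; the function name is hypothetical.

```python
def gpu_cost_range(gpu_count, unit_low_usd, unit_high_usd):
    """Return (low, high) total GPU hardware cost in US dollars."""
    return gpu_count * unit_low_usd, gpu_count * unit_high_usd


# Assumption: exactly 3,000 GPUs at $40,000 to $60,000 each.
low, high = gpu_cost_range(3_000, 40_000, 60_000)
print(f"${low / 1e6:.0f} million to ${high / 1e6:.0f} million")
# prints: $120 million to $180 million
```

Under these assumptions the GPUs alone would come to roughly $120 million to $180 million.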
Hosting data locally in Pakistan supports data sovereignty and helps comply with national data protection and security standards. It also gives faster response times for queries. The data center runs entirely on solar power, making it a green data center. Additionally, the government of Pakistan has allocated 2,000 MW of power from the national grid for AI data centers.
One of the objectives of locally hosted AI data centers is to support Urdu large language models (LLMs) that help Pakistani consumers who wish to use AI chatbots in local languages. A number of Urdu LLMs have already been developed in the country, including Alif and UrduLlama, both based on the open-source Llama-3.1-8B-Instruct model. Another model, UrduGPT, is described as Pakistan's own large language model. It is fine-tuned specifically for Urdu and regional languages of Pakistan, using local datasets and cultural semantics to ensure relevance for native speakers.
Currently, Pakistan has 27 data centers located in Karachi, Lahore, and Islamabad, operated by PTCL, Multinet, and Cybernet. More are being built. Zong, a local mobile phone service operator owned by China Mobile, is building AI-driven cloud infrastructure in Pakistan. Indus Cloud and Huawei have a strategic partnership that aims to launch a next-generation cloud data center, incorporating energy-efficient Huawei technology. XDS and Al Nahal IT Park are partnering to build a liquid-cooled data center at the Al Nahal IT Park in Sindh province. Mari Petroleum Company Limited (MPCL), the state-owned oil and gas firm, is diversifying by forming a subsidiary, Mari Technologies, to build Tier III and Tier IV data centers in Islamabad and Karachi, with the 5MW Islamabad facility set for completion by early 2026.

17 comments:
I'm not much acquainted with the technicalities involved in LLMs, but I suppose at a core level Urdu and Hindi LLMs can share a lot since both are essentially the same language (registers/variants of Hindustani) with a shared grammar and basic vocabulary? The difference in script shouldn't matter that much since there are already OCR tools that can easily recognize the patterns and shapes of letters.
Here’s Google AI response on this topic:
Hindi and Urdu LLMs differ primarily in their training data's script (Devanagari for Hindi, Perso-Arabic for Urdu) and vocabulary emphasis (Sanskrit for Hindi, Persian/Arabic for Urdu), though they share the same core grammar and spoken form (Hindustani), meaning an LLM trained on one can often understand the other with adjustments for script and style, but specialized models excel at capturing cultural nuances and specific linguistic features.
This video explains the core differences and similarities between Hindi and Urdu:
https://youtu.be/LbdRNCNrzAg?si=8F3-aLyihQk7zuQU
Key Differences for LLMs:
Script: Hindi uses the Devanagari script (left-to-right), while Urdu uses the Nastaliq script (right-to-left).
Vocabulary: Hindi draws more heavily from Sanskrit, while Urdu incorporates more Persian and Arabic loanwords, especially in formal contexts.
Cultural Nuance: Urdu LLMs often better capture Islamic and Mughal cultural contexts, while Hindi LLMs reflect Hindu traditions and Indian cultural elements.
How LLMs Handle Them:
Shared Foundation: Because they are linguistically the same spoken language (Hindustani) with identical grammar, base models trained on large multilingual datasets often perform well on both, especially when romanized.
Specialization: For high-fidelity outputs, specialized models are trained on distinct corpora:
Hindi LLMs: Focus on Devanagari text, Sanskritized vocabulary, and Indian cultural references (e.g., Indian news, literature, Bollywood).
Urdu LLMs: Focus on Nastaliq script, Persian/Arabic vocabulary, and Pakistani/Muslim cultural contexts (e.g., poetry, Islamic texts, Pakistani media).
Romanization: Distinguishing between romanized Hindi and Urdu in text is challenging but possible for LLMs, as shown in research, by recognizing subtle lexical and stylistic markers.
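The script difference described above is visible at the level of Unicode code points, before any model is involved. The sketch below is a minimal illustration, not how any particular LLM handles the two scripts: it counts characters in the Devanagari block (U+0900 to U+097F) and the basic Arabic block (U+0600 to U+06FF, which covers the letters used for Urdu; Nastaliq is a calligraphic style of that script, not a separate encoding). The function name is hypothetical, and the sketch ignores the Arabic supplement and presentation-form blocks.

```python
def dominant_script(text):
    """Classify text as 'devanagari', 'arabic', or 'other' by counting code points."""
    counts = {"devanagari": 0, "arabic": 0}
    for ch in text:
        cp = ord(ch)
        if 0x0900 <= cp <= 0x097F:    # Devanagari block
            counts["devanagari"] += 1
        elif 0x0600 <= cp <= 0x06FF:  # basic Arabic block (includes Urdu letters)
            counts["arabic"] += 1
    if counts["devanagari"] == 0 and counts["arabic"] == 0:
        return "other"
    # On an exact tie this returns 'devanagari' (the first key); good enough for a sketch.
    return max(counts, key=counts.get)


print(dominant_script("आज़ादी"))  # the word "azadi" in Devanagari
print(dominant_script("آزادی"))   # the same word in Perso-Arabic script
print(dominant_script("azadi"))   # romanized: neither block matches
```

Romanized text falls through to "other", which is why separating romanized Hindi from romanized Urdu has to rely on vocabulary and style rather than script.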
"Urdu LLMs often better capture Islamic and Mughal cultural contexts, while Hindi LLMs reflect Hindu traditions and Indian cultural elements."
Not sure about the vernacular Urdu spoken in Pakistan, but the vernacular Hindi spoken in the streets of north India isn't that much Sanskritized and makes considerable use of Persian/Arabic nouns (like "Bollywood Hindi"). Sanskritized Hindi is largely confined to official communication and Hindu religious contexts. There might be regional variations in this, however. It's kind of similar to how my native language Malayalam often swaps Sanskrit-origin words and native Malayalam/Dravidian-origin words to refer to the same thing depending on the context and the choice of the speaker/writer. Since the underlying grammar and core vocabulary (especially verbs) remain the same and only the choice of some nouns differs, they wouldn't be treated as different languages and can be handled by the same LLM. I guess aside from the handling of Nastaliq and Devanagari scripts, both Hindi and Urdu can be handled similarly by LLMs as well.
"Urdu LLMs often better capture Islamic and Mughal cultural contexts, while Hindi LLMs reflect Hindu traditions and Indian cultural elements."
Not sure about the level of "Persianization" in the vernacular Urdu spoken in Pakistan, but the vernacular Hindi spoken in the streets of north India isn't that much "Sanskritized" and makes considerable use of Persian/Arabic nouns (like "Bollywood Hindi"). Sanskritized Hindi is largely confined to official communication and Hindu religious contexts. There might be regional variations in this, however. It's kind of similar to how my native language Malayalam often swaps Sanskrit-origin words and native Malayalam/Dravidian-origin words to refer to the same thing depending on the context and the choice of the speaker/writer. Since the underlying grammar and core vocabulary (especially verbs) remain the same and only the choice of some nouns differs, they wouldn't be treated as different languages and can be handled by the same LLM. I guess aside from the handling of Nastaliq and Devanagari scripts, both Hindi and Urdu can be handled similarly by LLMs as well.
(Comment edit: Added "Persianization" in vernacular Urdu for clarification)
American multinational technology company Meta has officially launched Meta AI in Urdu for users in Pakistan, intending to accelerate the process of digital transformation in the country, a press release from the company said on Monday.
https://www.dawn.com/news/1951569
The announcement was made during an event titled “Future in Focus: AI and Innovation”, held in collaboration with the Ministry of Information Technology and Telecommunication (MoITT), the release said. Meta also announced the start of an experimental programme for AI education and government digital transformation in Pakistan.
“These initiatives aim to accelerate the process of digital transformation in the country,” the release said. “Pakistani users will now be able to interact with Meta AI not only in English but also in Urdu.”
Speaking at the event, Federal Minister for Information Technology and Telecommunication Shaza Fatima Khawaja said, “Under the Prime Minister’s Digital Nation Vision, Pakistan is moving toward a future where technology empowers every citizen.
“Our partnership with Meta reflects this commitment, promoting AI education, digital transformation, and innovation within government and educational institutions. The inclusion of Urdu in Meta AI marks a milestone that makes technology more inclusive and accessible, ensuring no one is left behind in this digital transformation journey,” she said.
Meta’s Director of Public Policy for South and Central Asia, Sarim Aziz, added: “We aim to support public sector and educational institutions in driving digital transformation through effective use of AI. We are also delighted that Meta AI is now available in Urdu, giving the local community new opportunities to connect with technology in their own language.”
According to the release, Meta has also introduced a localised edition of the guide “Transforming Public Sector Innovation in Asia Pacific with Llama”, developed in collaboration with Deloitte.
Prepared with the support of the ministry, the guide explains how Meta’s open-source AI model, Llama, can enhance government operations, improve public services, and strengthen data sovereignty. The document highlights best practices and successful examples from various Asia-Pacific countries, including Pakistan.
Similarly, Meta said it has launched the AI Literacy Programme in partnership with the Higher Education Commission (HEC), National Computing Education Accreditation Council (NCEAC), MoITT, and atomcamp.
“Under this programme, 350 non-computer science university teachers across Pakistan will be trained in basic AI skills so they can prepare students to meet the demands of the modern digital era,” the release added.
Meta also announced the Government Digital Transformation Xperience (GDTX) 2025 program, which aims to provide Pakistan’s public institutions with Meta’s technologies, solutions, and best practices. The programme, according to Meta, will bring together experts from the public and private sectors to exchange strategies and experiences for effective digital transformation.
Where are they getting energy for the data center?
The solar energy boom has resulted in surplus electricity in Pakistan’s national grid.
Pakistan generates about 10GW of surplus electricity, much of which is wasted due to underutilization. This surplus energy costs the government millions in unspent power obligations.
Reports estimate that the country’s surplus power capacity could range from 10,000 to 15,000 MW, depending on seasonal demand and system limitations.
While most of this power comes from fossil fuels, nearly 40% is derived from renewable sources, including wind, hydro and solar. https://www.ccn.com/news/crypto/pakistans-10gw-surplus-power-fuel-bitcoin-mining/
"but I suppose at a core level Urdu and Hindi LLMs can share a lot".
... and why would we do that?
G. Ali
@G Ali,
- "..and why would we do that?"
I did not ask anyone to "do" anything here. As someone with an interest in the history and development of languages, I was only echoing a thought here that since "Urdu" and "Hindi" are essentially the same language underneath, LLMs that are developed to handle them could possibly share many things (learnings?). That said, like I mentioned in my comment I have little idea how LLMs actually work, and I do not know the level of feasibility of such a "collaboration" of Hindi and Urdu LLMs.
And whom did you mean by "we" here? Pakistan? If so, please bear in mind that as a language that originated in what is now western Uttar Pradesh, Urdu/Hindi/Hindustani is a shared heritage of the subcontinent. Several Indian states have accorded co-official status to "Urdu" written in Nastaliq script and it is also one of the 22 "official" languages of India as per the Eighth Schedule of the Constitution. In fact, the Indian Supreme Court too clarified the "Indian-ness" of Urdu in a recent judgement.
https://www.ndtv.com/india-news/supreme-court-urdu-marathi-language-not-religion-top-court-rejects-plea-against-urdu-on-signboard-8175541
Of course, there may be some in modern India who want to paint "Urdu" as a foreign language of a particular religious group (and leave it to rot), and there may be those across the border who essentially lend support to the same nonsense, but that's another matter.
There you go again with irrelevant rambling. So, tldr.
Btw, a while back I had the opportunity to spend some time with an Urdu poet from India, who was not really very flattering about how Urdu is treated in India.
@Zamir, I assume you are the "G Ali" posting under a different ID. ("Irrelevant rambling" and "tldr" sounded familiar anyway.)
I didn't say the treatment of "Urdu" in India is flattering. In fact, that's what I hinted at in my last statement above when I wrote that there are some people on both sides of the border who want to paint an exclusive religious identity onto it. But the fact remains that as part of the "Hindustani" continuum it is an "Indian" (or by linguistic classification, "Indo-Aryan") language and therefore has the same official status as other languages of India.
This week, tech giants Amazon and Microsoft pledged an eye-popping $50bn-plus combined investment in India, putting artificial intelligence (AI) in the spotlight.
https://www.bbc.com/news/articles/cd74gjw1j11o
As fears of an AI bubble swept global markets and tech stock valuations soared, several leading brokerages took a contrarian view on India's AI landscape.
Christopher Wood of Jefferies said the country's stocks were a "reverse AI trade". That basically means India should outperform other markets in the world "if the AI trade suddenly unwinds" - or simply put, the global bubble bursts.
HSBC also held a similar view, saying Indian equities offered a "hedge and diversification" for those uneasy with the ongoing AI rally.
This comes as Mumbai stocks have lagged behind their Asian peers over the past year, with foreign investors moving billions into Korean and Taiwanese AI-driven tech companies in the absence of comparable opportunities in India.
In this backdrop, the Amazon and Microsoft investments provide a much-needed fillip - yet it remains worth asking where India truly stands in the global AI race.
There are no easy answers.
The adoption of AI in India has been rapid. Investments into some parts of the value chain - such as data centres, the physical backbone of AI, or chip-making facilities - have begun trickling in. Just this week, American chipmaker Intel announced a collaboration with Mumbai-based Tata Electronics to manufacture chips locally.
But when it comes to a sovereign AI model, it appears India is continuing to play catch-up.
About a year-and-a-half ago, the Indian government launched an AI mission through which it began supplying start-ups, universities and researchers with high-end computing chips to develop a large homegrown AI model like OpenAI or China's DeepSeek.
According to the federal electronics ministry, the launch of the sovereign model - which supports more than 22 languages - is imminent. In the interim though, the likes of DeepSeek and OpenAI have made further advances, launching newer variants.
While the government has recognised the need to reduce over-dependence on foreign platforms because of the risk of surveillance and sanctions, India's $1.25bn sovereign mission is a shadow of France's $117bn or Saudi Arabia's $100bn programmes.
The country's ambitions also face numerous other hurdles - from semiconductor availability to skilled talent and fragmented data ecosystems, according to global consultancy EY.
India currently lacks enough computational infrastructure or the billions of dollars of research and development (R&D) investment made over decades that gave China and the US a distinct leg up.
Despite its global strength in AI talent, India struggles to keep its developers at home.
"The current tightening of overseas work visas provides India a window of opportunity to retain domestic talent and attract Indian-origin talent at home. However, given that top-tier AI talent is mobile globally, attractive policy incentives need to be put in place to incentivise relocation to India," the EY report says.
China, for example, offers a range of incentives such as "financial support and subsidies, tax incentives and funding for research and development, special talent visas and fast-track immigration", the report says.
India has a much higher concentration of AI-skilled professionals than the global average - specifically, 2.5 times more. Policies that retain this talent are not yet in place.
After 11.5 Years of Officially Trying to Redefine Indian Culture, We Have FA9LA
Seema Chishti
https://thewire.in/culture/after-11-5-years-of-officially-trying-to-redefine-indian-culture-we-have-fa9la
Jashn-e-Rekhta (Rekhta meaning “mixed”), the Urdu festival organised since 2015 on an industrial scale, was the biggest open-air event in the capital between December 5 and 7. Approximately 120,000 people came and inhaled Delhi’s noxious air to be a part of the live vibe. Audiences were visibly transfixed by the sounds and draw they felt towards Urdu. Jashn-e-Rekhta gets panned in some circles for not being political enough or focussing on just the foods, lyricism and ‘beauty’ surrounding a great Indian language, currently in an existential crisis. But we have come to a point when howsoever unintended, even signalling surviving and thriving is a political act.
Scholar and public intellectual Alok Rai – also Premchand’s grandson – was in the capital speaking on Urdu’s future. He spoke on how despite mounting government pressure to keep Urdu out of the room, it is unable to make it go away. Its popularity, in his view, stems from two important places; first, it having developed from an engagement with several languages and territories in India, and being spoken in India over centuries, meant the sounds and tones have been sandpapered and polished, rendered almost mellifluous to the human ear. Second, Bombay Hindi cinema music serves as the basic emotional landscape of those parts of India familiar with Hindi. That sensibility has almost intravenously fed Hindustani and Urdu into our psyche.
History suggests such purification drives backfire. Could we be seeing a replay of how misplaced ideas of India’s first I&B minister B.V. Keskar for a ‘pure’ All India Radio (AIR), banning them from playing impure Hindi film songs (in Hindustani) served as rocket-fuel for the popularity of both Hindi film music and the derided harmonium? It eventually forced AIR to acquiesce and start a station, Vividh Bharti, that would play Hindi film music. This is a point very effectively made by Isabel Huacuja Alonso in Radio for the Millions, Hindi-Urdu Broadcasting Across Borders.
At Alok Rai’s public lecture on Sunday, former Culture Secretary Ashok Vajpayee recounted that in Ujjain during Mahakal this year, he was visiting after it was no longer Shahi Sawari, but Rajsi Sawari, as de-Urdufication was on. He addressed the locals in a gathering and told them he was very puzzled, “bhai, sawari bhi to Urdu hai (even sawaari is an Urdu word)”.
A drive towards ‘One-Culture’, being about only ‘one’ imagined, Brahminical Hindutva-laden variety, is unable to really get around the multiple strands of what makes the Indian weave. Arabic rap, qawwali beats, the dots, dashes and accents that make for a happy “mixed” and mixed-up confluence remain the top notes of all that is Indian.
Vineeth, you obviously don't know either Hindi or Urdu. So let me enlighten you. When a person speaks in Hindi, even if he is a college professor, he sounds uneducated (Amitabh Bachchan is an exception to this rule).
Urdu, on the other hand, is extremely polished, polite and sophisticated. Even a street vendor, when he speaks proper Urdu, sounds like a college professor.
G. Ali
Attempts to "purify" any language by removing other linguistic influences that date back centuries are an absurdity. Just as "Urdu" cannot be "purified" of its Prakrit base and grammar, "Hindi" cannot be "purified" of its Persian/Arabic loanwords either. In fact, as a hobby I started keeping a list of all the Persian/Arabic loanwords I could find in common "Hindi" speech, adding to the list whenever I came across a new one. Even with my limited grasp of "Hindi", the list was growing huge! It made me wonder where one would draw the line between "Hindi" and "Urdu" here.
When Modi govt celebrated 75 years of Indian independence as "Azadi ka Amrit Mahotsav", did they realize "Azadi" is a Persian loanword? Or when Modi supporters shout "Modi hai to Mumkin hai", do they realize "Mumkin" is Arabic?
Sanskritization of "Hindi" sounds as pointless an exercise as the "de-Sanskritization" of Malayalam and Tamil that some Dravidian puritans have advocated in our part of the country. Sanskrit vocabulary has entrenched itself so deeply in Malayalam over centuries that it is often difficult to speak even a sentence in it without resorting to one word or another of Sanskrit origin. In many cases, the native Malayalam terms of Dravidian origin have been forgotten after long periods of disuse, and only the Sanskrit-origin terms have remained. As a result, attempts to speak a "de-Sanskritized" version of Malayalam would be even more difficult and awkward than "de-Persianized" Hindi.
I came across this piece of news yesterday. This is what the subcontinent really needs. Languages like Sanskrit, Urdu, Bengali, etc. are a shared heritage of the entire subcontinent. Instead of painting these languages as the property of the "other" and neglecting or demonizing them, we should cherish them as our common heritage.
"In a First Since Independence, A Pakistan University Is Teaching Sanskrit"
https://m.thewire.in/article/south-asia/in-a-first-since-independence-a-pakistan-university-is-teaching-sanskrit
@G Ali,
I acknowledge that I'm not a native speaker of Hindi/Urdu/Hindustani and my limited acquaintance with "Hindi" was the few years I learned it as a third language at school. But tell me, what exactly do you mean by a person speaking "Hindi" and a person speaking "Urdu" here? I haven't heard anyone speak Standard Sanskritized Hindi here, other than a few Sanskrit words that are used in religious contexts. The average "Hindi" speaker you encounter in the Indian streets uses Persian/Arabic vocabulary quite liberally. And as for your clearly parochial and chauvinistic (and shall I say laughably absurd) remarks about one vernacular Hindustani variant/dialect being more "uneducated", "unpolished" and "impolite" than another, I can only observe that your thought process likely has more in common with "sanghis" than you may realize.