Showing posts with label Generative AI. Show all posts

Thursday, November 7, 2024

Pakistan to Develop Urdu LLM for Generative AI

The National University of Sciences and Technology (NUST), the National Information Technology Board (NITB) and telecom network operator Jazz have signed a Memorandum of Understanding (MOU) to develop Pakistan’s first indigenous Large Language Model (LLM), with a focus on Urdu and including datasets for the Pashto and Punjabi languages. The project aims to give individuals, businesses, and organizations access to advanced AI tools in their native languages. The envisioned LLM is expected to drive innovation in generative AI applications, boosting productivity and accessibility in critical sectors such as healthcare, education, and agriculture.

GPT-4 Accuracy Scores. Source: The Economist


Generative AI tools such as ChatGPT are powered by large language models, or LLMs. To be useful in a given language, these models need to be trained on vast amounts of data in that language. Unfortunately, Urdu accounts for less than 0.1% of the content on the Internet. This scarcity will be a challenge for the developers of Urdu LLMs.

Online Content of Various Languages. Source: W3Techs 


The lack of Urdu content available for training ChatGPT reduces the accuracy of its results for Urdu-language users. For example, GPT-4 scores just over 70% on question-answer tests in Urdu, compared with 85% in English, according to data from OpenAI. Other South Asian languages, including Hindi, Bengali, Punjabi, Marathi and Telugu, suffer from the same problem.

It's not just a South Asian problem. The same challenges exist across the developing world. Non-European languages are generally poorly represented online, which is a major obstacle for non-European nations seeking to develop their own generative AI models, since these rely on vast amounts of training data. Generative AI can also produce biased results because of several factors, including the data it is trained on, the algorithms used, and how it is deployed.

Without such models, the use of AI in developing nations such as Pakistan will remain limited to the small number of people proficient in English. Broadening the adoption of AI applications will require LLMs trained on local-language content. Failing to develop them could cost Pakistan the opportunity to take full advantage of the AI revolution.


Friday, August 4, 2023

OpenAI ChatGPT: Generative AI Buzz in Pakistan

Singapore-based cybersecurity firm Group-IB discovered in June that over 100,000 ChatGPT user accounts had been compromised and their credentials found on the Dark Web. Among the accounts reported compromised, India topped the list with 12,632, followed by Pakistan with 9,217 and Brazil with 6,531. Bangladesh had the fewest, with 2,463. The report gave a glimpse of the high level of interest in generative AI among Indians and Pakistanis. Another report, attributed to Similarweb, which tracks the popularity of websites by number of visitors, ranked ChatGPT at number 7 in Pakistan, ahead of Instagram, Twitter and TikTok. Globally, the ChatGPT website is ranked 17th.

Prior to this, there was a series of news reports about the launch of the Presidential Initiative for Artificial Intelligence and Computing (PIAIC) by President Arif Alvi, followed by the government's policy to train one million AI experts in the country by 2027. Pakistanis published 2,600 AI-related research papers from 2016 to 2020, according to Statista.

Top 10 Countries by Number of ChatGPT Accounts Compromised. Source: Group-IB


Back in 2017, then Prime Minister Shahid Khaqan Abbasi inaugurated a National Centre for Artificial Intelligence (NCAI) at the National University of Sciences & Technology (NUST) in Islamabad. It was followed by a Rs 1.1 billion budgetary allocation for select universities with AI research to be coordinated by NCAI. In 2020, Pakistan Air Force (PAF) set up a Center of Artificial Intelligence and Computing (CENTAIC). 

While OpenAI was the first to offer a generative AI model trained on vast amounts of data, Google has also joined the generative AI race with its own offering. Google Bard appears to have capabilities similar to those of OpenAI's ChatGPT. Very little is known about the specific datasets used to train either of them, which raises questions about how far their results can be trusted.

Training/Using Generative AI Foundation Models. Source: Analytics Vidhya

Top global cloud operators Amazon, Google and Microsoft are now offering generative AI services to their clients for an additional fee. Cloud app developers in Pakistan and elsewhere can train these base models on their own custom datasets to develop AI applications for agriculture, business, education, finance, healthcare, law and other fields. The AI market in Pakistan is currently estimated at $123 million by Statista Market Insights.

Amazon Web Services (AWS) recently featured four Pakistani startups at the forefront of AI/ML: SalesFlo, Ozoned Digital, XpertFlow and Trukkr. SalesFlo offers sales software for FMCG (fast-moving consumer goods) companies. Ozoned Digital caters to the technology needs of the insurance industry. XpertFlow is an AI-powered preventative healthcare company. Trukkr provides financial services and technology for logistics. These and other startups are well positioned to take advantage of the new generative AI services being offered by cloud vendors.
