Data Science Intern

Jun 17, 2024

In Data Science, which LLM is the best for the job?

In an era where technology is rapidly advancing, the growth of AI over the past five years has made AI easily accessible and applicable in every field. Looking back just two years to the first introduction of LLMs (Large Language Models) and comparing them to today, we see continuous development and the introduction of new features. The competition between providers is intense.

In this article, BOTNOI Group will gather the most popular LLMs to see, from a Data Scientist's perspective, which tools best aid in analysis and support, ultimately making our lives easier.



ChatGPT-4o

This model was launched by OpenAI on May 13, 2024. The number "4" indicates the version of the model, while the letter "o" stands for "omni." As an omni model, it accepts text, audio, image, and video inputs. Notably, it can respond to voice input in as little as 232 milliseconds, with an average of 320 milliseconds, closely matching human conversational rhythm.

What are the interesting features of ChatGPT-4o?

  1. Upgraded Voice Mode:

    • Enhanced emotional tone modulation and faster response times.

    • Allows users to interrupt and ask questions while ChatGPT is speaking.

    • Can distinguish between different speakers' tones.

  2. Real-Time Translation:

    • Supports Voice Mode, acting as a real-time language interpreter.

    • Can generate various tones and emotions in speech.

    • Capable of changing voices on command and even singing.

  3. Real-Time Image Reading:

    • Reads images through a mobile camera or a desktop screen in real-time.

    • Answers questions about the images instantly.

  4. Enhanced Multi-Language Support:

    • Improved functionality in 50 languages.

    • Optimized tokenization system reduces token usage by 1.1 – 4.4 times in 20 languages, saving tokens when prompts are typed.

  5. Improved Natural Language Processing (NLP):

    • Enhanced text data analysis capabilities, such as Sentiment Analysis.

    • Quickly and accurately summarizes or generates reports from large datasets.

  6. Machine Learning Model Development and Optimization:

    • Assists in selecting the appropriate model for specific tasks and fine-tuning model hyperparameters.

    • Provides explanations and interpretations of model results.
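The token savings mentioned in item 4 can be made concrete with a little arithmetic. The sketch below only applies the "1.1 to 4.4 times" range quoted above to a hypothetical prompt size; the function names and the 1,000-token example are illustrative assumptions, not measurements.

```python
def tokens_after_upgrade(old_token_count: int, reduction_factor: float) -> int:
    """Estimate the token count under the newer tokenizer.

    reduction_factor is the "times fewer tokens" figure (1.1 to 4.4 in the
    article). The value for a given language is an input here, not something
    this sketch measures.
    """
    if reduction_factor < 1.0:
        raise ValueError("reduction_factor must be >= 1.0")
    return round(old_token_count / reduction_factor)


def percent_saved(reduction_factor: float) -> float:
    """Share of tokens saved, as a percentage of the old count."""
    return (1.0 - 1.0 / reduction_factor) * 100.0


if __name__ == "__main__":
    # Hypothetical prompt that used 1,000 tokens before the upgrade.
    for factor in (1.1, 4.4):
        new_count = tokens_after_upgrade(1000, factor)
        print(f"factor {factor}: {new_count} tokens, "
              f"{percent_saved(factor):.1f}% saved")
```

At the low end of the range the saving is under 10%, while at the high end roughly three quarters of the tokens disappear, so the benefit depends heavily on the language of the prompt.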

Using ChatGPT-4o in Data Science

With the recent upgrades and improvements to ChatGPT-4o, its usability has greatly increased, making it a highly valuable tool for both daily life and work-related tasks.

For those working in the data field, the ability to gain insights, modify data, remove outliers, perform K-means clustering, and more, by simply providing data and prompts, makes this tool incredibly convenient and efficient. Although the generated results still need to be verified, it significantly speeds up various tasks and enhances productivity.
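Because generated results still need to be verified, it can help to keep a small reference implementation on hand and compare it with what the LLM returns. The following is a minimal sketch, assuming one-dimensional numeric data, Tukey's 1.5 × IQR rule for outliers, and caller-supplied starting centroids; none of these choices come from ChatGPT itself, and the sample numbers are made up for illustration.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def kmeans_1d(values, centroids, iterations=20):
    """Plain Lloyd's algorithm on 1-D data with fixed starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: an empty cluster keeps its previous centroid.
        centroids = [statistics.fmean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


if __name__ == "__main__":
    raw = [10, 11, 12, 13, 50, 51, 52, 53, 500]  # 500 is an obvious outlier
    cleaned = remove_outliers_iqr(raw)
    centers, groups = kmeans_1d(cleaned, centroids=[0, 100])
    print(cleaned)
    print(centers, groups)
```

If the clusters or the filtered rows reported by the LLM differ from a check like this, that is a signal to inspect the prompt, the data, or the parameters the model chose.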


Gemini 1.5 Pro

Gemini 1.5 Pro, the LLM developed by Google, was launched on May 14, 2024, in the United States (the morning of May 15, 2024, in Thailand). It has been trained on both text and code datasets, enabling it to perform tasks such as text generation, language translation, and question answering. However, it is still under active development, so its responses may change over time and are occasionally inaccurate.

Interesting Features of Gemini 1.5 Pro

  • Largest Context Window: Among LLMs available to the general public, Gemini 1.5 Pro boasts the largest context window, capable of processing up to 1 million tokens, equivalent to a document approximately 1,500 pages long.

  • File Upload Capability: Users can upload various files, such as those from Google Drive or their own devices, into Gemini Advanced. This feature allows Gemini to analyze the data within those files.

  • Customizable Gems: Users can create custom versions of Gemini, known as Gems, for specific purposes. For example, a Gem can be created to help plan exercise routines or travel itineraries.

  • Integration with Gmail and Google Calendar: Gemini Advanced can access a user’s Gmail and Google Calendar, making it easier to plan trips and events according to the user's schedule without manually searching for information.

  • Enhanced Image Understanding: Gemini 1.5 Pro can better understand images, such as solving math problems from photos of equations.

  • Gemini Live: This feature allows users to interact with Gemini using voice commands. When the user’s camera is on, Gemini Live can respond to the surroundings in real-time.
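The context window figures above (1 million tokens, roughly 1,500 pages) imply a rule of thumb of about 667 tokens per page. The sketch below uses that derived ratio to estimate whether a document is likely to fit; the ratio is an approximation, and the reserved answer budget of 8,000 tokens is an assumption chosen for illustration.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the article
PAGES_PER_WINDOW = 1_500           # figure quoted in the article

# Derived rule of thumb (about 667 tokens per page). Real documents vary.
TOKENS_PER_PAGE = CONTEXT_WINDOW_TOKENS / PAGES_PER_WINDOW


def estimated_tokens(pages: int) -> int:
    """Rough token estimate for a document of the given page count."""
    return round(pages * TOKENS_PER_PAGE)


def fits_in_context(pages: int, reserved_for_answer: int = 8_000) -> bool:
    """True if the document plus a reserved answer budget fits the window."""
    return estimated_tokens(pages) + reserved_for_answer <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    for pages in (300, 1_500):
        print(pages, estimated_tokens(pages), fits_in_context(pages))
```

An estimate like this is only a first check. The actual token count depends on the language, layout, and tokenizer, so a document near the limit should be counted with the provider's own tooling.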

Using Gemini in Data Science

Gemini 1.5 Pro was introduced shortly after the launch of ChatGPT-4o. Its very large context window lets it take in long documents and large datasets in a single prompt, which greatly facilitates tasks in the field of Data Science. This can significantly reduce working time and make data-related tasks easier and faster.



Copilot+

Developed by Microsoft, this LLM was launched on May 21, 2024. It has been trained on code datasets from various programs, making it ideal for tasks such as code suggestion, error correction, and code completion. Overall, it is a model primarily suited for programming-related tasks. Additionally, Copilot+ operates locally on your device, ensuring privacy and security.

Interesting Features of Copilot+

  • Copilot+ includes all the capabilities of the standard Copilot.

  • On-Device Recall Feature: This feature ensures that all data processing happens locally on your device, without sending information elsewhere (according to Microsoft). It stores images, screen recordings, and everything you do, using Generative AI to process and make this information searchable. This allows the AI to be fully aware of your activities.

  • Image Editing: By simply right-clicking and using a prompt, you can edit images easily, such as removing backgrounds from videos.

  • Email Notifications: When an email notification appears, the model provides suggested responses that can be sent with a single click, allowing for quick and efficient replies.

  • Cocreator: The latest feature, Cocreator, leverages Generative AI to help create art through prompts or by sketching in programs like Paint or Photoshop.

  • Application Integration: Copilot+ supports collaboration with many other applications.

  • Real-Time Language Translation: It can translate languages during video calls into English very quickly. Currently, the model supports up to 44 languages.

Using Copilot+ in Data Science

The ability of Copilot+ to monitor the screen in real-time makes it one of the best tools for Data Scientists. Unlike ChatGPT, which requires downloading and uploading files such as Excel sheets before analysis, Copilot+ allows for seamless data interaction without these extra steps. This feature can be a game-changer, significantly enhancing workflow efficiency. However, the main drawback is the necessity of having a suitable computer to utilize this feature fully.


AWS

Among cloud service providers, AWS offers two well-known generative AI products: "Amazon Bedrock" and "Amazon Titan."

Amazon Bedrock is a platform service that allows customers to access and use foundation models. Amazon Titan is a family of LLMs developed to support various applications related to natural language processing (NLP).

Interesting Features of AWS

  1. Amazon SageMaker: A comprehensive service for building, training, and deploying machine learning models. It consists of four main features:

  • SageMaker Studio: A workspace for data science with a notebook interface for writing code, analyzing data, and connecting to GitHub.

  • SageMaker Pipelines: A tool for CI/CD in ML, used for cleaning data and for creating and updating models.

  • SageMaker Training: Used for training models, evaluating algorithms, and optimizing performance.

  • SageMaker Endpoints: A tool for deploying models to be used in real-time, including auto-scaling capabilities.

  2. Amazon Rekognition: A computer vision service with various features:

  • Face Liveness: Detects real users and prevents spoofing within seconds during face verification.

  • Custom Labels: Detects custom objects, such as brand logos, using Automated Machine Learning (AutoML) to train your models with images.

  • Video Segment Detection: Identifies key segments in videos like black frames, opening/closing credits, slates, color bars, or shots.

  • Labels: Detects objects, scenes, activities, landmarks, dominant colors, and image quality.

  3. Amazon Lex: A service for creating text or voice chatbots using the same technology as Alexa. It works alongside a wide range of AI and deep learning tools and frameworks such as Amazon SageMaker, TensorFlow, Apache MXNet, and PyTorch.

  • Developing Chatbots: Creates chatbots that understand conversational context and work in multiple languages.

  • Customize Bot Capabilities: Designs and deploys multi-channel AI applications with ease.

  4. Amazon Comprehend: A service that extracts insights from data using machine learning models with Natural Language Processing (NLP) to understand human language.

  • Topic Modeling: Analyzes and identifies main topics in large documents or texts using topic modeling techniques.

  • Entity Recognition: Identifies and categorizes key entities in text, such as names, places, organizations, and dates, to provide business insights.

  5. Amazon Forecast: A service for easily and accurately predicting business outcomes using machine learning.

  • Predictive Accuracy: Creates forecasts based on time-series data to evaluate future trends using advanced machine learning techniques.

  • Custom Models: Customizes models to meet business needs.

  • Scalable Forecast: Manages large datasets and multiple time-series data.

  6. Amazon Personalize: A service for creating recommendation applications using machine learning, similar to the recommendation system used on Amazon.com.

  • Real-time Recommendations: Provides high-quality recommendations in real-time based on changes in customer behavior.

  • User Personalization: Quickly offers tailored recommendations for individual users.

  7. Amazon Polly: A service that converts text into speech in multiple languages, allowing for the creation of voice-enabled applications.

  • Support Numerous Languages: Offers many languages through TTS and NTTS technologies, ensuring smooth text-to-speech conversion.

  • Generate Human Pattern: Customizes and controls speech to closely mimic human behavior.

  8. Amazon Textract: A service that automatically extracts data from scanned documents, handwritten texts, images, or PDFs using machine learning.

  • Optical Character Recognition: Uses machine learning to recognize printed and handwritten text in documents.

  • Document Classification: Ability to categorize and format documents.

Using AWS in Data Science

From the details gathered about the models, it is evident that AWS's AI capabilities are highly versatile, supporting a wide range of tasks from data science to designing data pipelines and building models for analysis. The input data can be in the form of images or text. Most importantly, the data used in AWS is secured and maintains enterprise-level privacy.

Evaluating the Performance of LLMs: Which One Works Best?

The performance and responsiveness of an LLM (Large Language Model) are often assessed through various criteria. Here, we will use six performance benchmarks to evaluate their effectiveness:

  1. Massive Multitask Language Understanding (MMLU): A benchmark for measuring a model’s understanding and capability across various subjects, including mathematics, history, computer science, law, and more. The model must possess extensive knowledge and problem-solving abilities related to diverse world knowledge.

  2. Graduate-Level Google-Proof Q&A (GPQA): These are complex academic questions requiring deep analytical thinking and understanding of the subject. Typically written by experts in fields such as biology, physics, and chemistry, these questions are so challenging that even experts or PhD candidates in the relevant fields can only answer about 74% of them correctly.

  3. MATH: Consists of middle school and high school level math problems.

  4. HumanEval: A code generation benchmark made up of hand-written Python programming problems. The model must write a function for each problem, and the result is checked for correctness by running unit tests.

  5. Multilingual Grade School Math (MGSM): Consists of elementary school-level math problems translated into multiple languages, including less commonly spoken ones, to evaluate the model's cross-language capabilities and adaptability.

  6. Discrete Reasoning Over Paragraphs (DROP): Involves questions that require understanding and processing entire paragraphs or long texts to verify if the AI system can accurately extract and use the information from those texts to provide correct answers.
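Scores on multiple-choice benchmarks such as MMLU are usually reported as the percentage of questions answered correctly. The sketch below shows that calculation on made-up answers; the function name and sample data are illustrative assumptions, and other benchmarks in the list may use different metrics (DROP, for example, is commonly reported as an F1 score).

```python
def accuracy_score_0_100(predictions, answers):
    """Percent of predictions that match the answer key (case-insensitive)."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


if __name__ == "__main__":
    # Hypothetical model outputs and answer key for four questions.
    model_outputs = ["A", "c", "B", "D"]
    answer_key = ["A", "C", "D", "D"]
    print(accuracy_score_0_100(model_outputs, answer_key))
```

A difference of a point or two on such a scale can come from prompt format or answer parsing alone, so small gaps between models should be read with caution.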

Here is the comparison based on the six performance benchmarks, scored from 0 to 100. Note that there is no data for Gemini 1.5 Pro for the GPQA benchmark.

  • GPT-4o scores the highest on nearly every benchmark, showcasing its versatility and capability in handling a wide range of tasks.

  • GPT-4 Turbo and Claude 3 Opus follow closely, with strong performances across most benchmarks.

  • Gemini 1.5 Pro, while slightly behind in some areas, still demonstrates robust performance, particularly in MATH and HumanEval, despite the lack of data for GPQA.

Comparison Table of Essential Data Scientist Tasks with Usability in Each LLM Model


After reviewing the data for all four AI tools, we conclude that ChatGPT-4o is the best choice for data science tasks. It performs strongly in mathematical processing and other academic areas, which we attribute to the diversity of its datasets. Additionally, in pattern detection or "Needle in Haystack" testing, ChatGPT-4o demonstrates 100% accuracy, equaling or surpassing other AI models. ChatGPT-4o's notable new feature is its improved NLP processing, and it also excels in selecting the appropriate ML model for the dataset.
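A "Needle in Haystack" test hides one fact inside a long stretch of filler text and asks the model to retrieve it, usually at several positions in the context. The sketch below shows the general shape of such a test. The filler sentence, the needle, and `fake_model` are all hypothetical stand-ins; a real test would replace `fake_model` with a call to the LLM being evaluated.

```python
def build_haystack(filler_sentence, needle, total_sentences, depth):
    """Place `needle` at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    sentences = [filler_sentence] * total_sentences
    sentences.insert(round(depth * total_sentences), needle)
    return " ".join(sentences)


def needle_accuracy(ask_model, needle, expected_answer, question,
                    depths, total_sentences=200):
    """Percent of depths at which the model's reply contains the answer."""
    hits = 0
    for depth in depths:
        context = build_haystack("The sky was clear that day.",
                                 needle, total_sentences, depth)
        reply = ask_model(context, question)
        hits += expected_answer.lower() in reply.lower()
    return 100.0 * hits / len(depths)


def fake_model(context, question):
    """Stand-in for a real LLM call; it simply searches the context."""
    marker = "The secret code is "
    start = context.find(marker)
    if start == -1:
        return "I don't know."
    return context[start + len(marker):].split(".")[0]


if __name__ == "__main__":
    score = needle_accuracy(
        fake_model,
        needle="The secret code is 7421.",
        expected_answer="7421",
        question="What is the secret code?",
        depths=[0.0, 0.25, 0.5, 0.75, 1.0],
    )
    print(score)
```

Reporting the result per depth, rather than only as a single percentage, makes it easier to see whether a model loses track of facts placed in the middle of a long context.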

This article was composed and researched by these authors:

  • Pitchaya Rueangsakpakdee (Focus)

  • Wish Chamnansua (Daniel)

  • Kittipong Charoenpong (Nam)

  • Artorn Damnoenudomkan (Jen)

  • Teerapat Sangkrajang (Tee)

  • Sorawit Singpraphai (Future)

  • Phuvanach Phoemphol (Arm)

  • Jirachaya Nongbualang (Erng)

  • Vasavat Limnanthasin (Gram)

  • Witchapon Kasettakarn (Game)


Collaborate to Innovate

Together, We Build the Future.
