Hi! I am Farhan Siddiqui, an accomplished Data Scientist and Full Stack Developer specializing in Artificial Intelligence (AI) and Machine Learning (ML). With over five years of experience working with esteemed organizations and providing consultancy services, I bring a wealth of knowledge to the table.
My educational background includes a Master's in Data Science, an MBA in Logistics and Supply Chain Management, and a Bachelor's in Computer Science.
In the realm of Data Science, Deep Learning, and Machine Learning, my expertise extends to specialized areas such as Advanced Natural Language Processing, Computer Vision, and Time Series Analysis. What sets me apart from other Data Scientists is my ability to deliver practical prototypes or complete full-stack ML/data/AI web applications tailored to your specific needs.
In the domain of Advanced Natural Language Processing, I work with both conventional NLP techniques and state-of-the-art transformers, including encoder-only models (e.g., BERT), encoder-decoder models (e.g., BART, T5), and decoder-only models (e.g., GPT, GPT-3.5, GPT-4). I use libraries such as Hugging Face, along with embeddings and vector databases or similarity-search libraries such as Pinecone and FAISS, to optimize the operations of Large Language Models (LLMs). I have also used LangChain to integrate the components of end-to-end Generative AI applications: LLMs (OpenAI GPT-4, Llama, and other enterprise and open-source models), Agents, Memory, Prompts, and more. My recent generative AI projects include:
• LangChain LLM-based Chatbot for country-specific legal documents stored in a vector DB.
• LangChain GPT-powered Chatbot web app that answers queries based on extensive medical books stored in a vector DB.
• AI agent that scrapes knowledge from web sources, generating descriptive, prescriptive, and predictive reports based on a given keyword for marketing analytics.
• AI agent that retrieves relevant research documents from PubMed using semantic search, offering the latest insights on disease cures.
• AI agent that fetches news related to given stock tickers, providing comprehensive reports for informed financial decision-making.
• Chatbot that responds to customer queries based on enterprise guidelines.
In the realm of Computer Vision, my expertise covers a wide spectrum, including Generative Adversarial Networks (GANs), Convolutional Neural Networks (CNNs), Data Augmentation, Transfer Learning (e.g., VGG16, VGG19, MobileNet, InceptionV3, ResNet50, GoogLeNet, AlexNet), Object Detection (e.g., YOLOv8, YOLO-NAS, YOLOv5, SSD, R-CNN), and more. I am proficient in frameworks such as TensorFlow, Keras, and PyTorch. Additionally, I have hands-on experience with tools like MediaPipe, Dlib, OpenCV, Hugging Face, Roboflow, and various open-source libraries and models available on GitHub.
In the domain of Time Series Analysis and Forecasting, I am well-versed in both traditional statistical methods (e.g., AR, MA, ARMA, ARIMA, ARCH, GARCH) and more recent approaches (e.g., RNN, LSTM, GRU, 1D-CNN, Transformers, Prophet, Hybrid ARIMA-LSTM).
Some other tasks that I can handle include:
• Cloud computing on AWS, including AWS SageMaker.
• Creating dynamic and insightful dashboards using Power BI, Google Data Studio, Tableau, MS Excel, and Google Sheets to visualize business performance, leveraging my MBA background to understand marketing, financial, and business dynamics.
• Proficiency in Data Engineering, employing a mix of modern and conventional tools for Feature Engineering and Extraction to enhance Machine Learning and Deep Learning algorithms.
• Web scraping using Scrapy and Python to gather data from legal sources.
• Designing and querying SQL and NoSQL (such as MongoDB) databases.
I use a range of software tools, including but not limited to Advanced Excel, Power BI, Tableau, Google Data Studio, RapidMiner, and AWS SageMaker.