Muhammad Danial Khilji
Data Scientist | Machine Learning Engineer
Google Certified Generative AI Leader
RSS Certified Professional Data Scientist and Data Analyst
About
I’m a Data Scientist at WPP, specializing in knowledge representation, graph-based systems, and applied machine learning. I design and productionize semantic systems for audience and identity mapping, using embeddings, probabilistic matching, and graph-based approaches. My work on the Audience Translator platform has enabled 10K+ users to map taxonomies to platforms like Meta and Google. I build end-to-end solutions combining LLMs, retrieval-augmented generation (RAG), AI agents, and structured data, including taxonomy enrichment, automated validation workflows, and scalable Python pipelines. More recently, I’ve focused on AI agent systems, developing a LangGraph-based multi-agent framework that automates ad-tech platform research by extracting API insights, audience taxonomies, and reach estimates. I’m particularly interested in building intelligent systems at the intersection of structured knowledge, retrieval, and AI agents.
Skills
Domains & Expertise
Technologies & Tools
Publications
Features matching using natural language processing
International Journal on Cybernetics & Informatics (IJCI) - Mar 24, 2023
Feature matching is a basic step in matching different datasets. This article presents a new hybrid model in which BERT, a pretrained Natural Language Processing (NLP) model, is used in parallel with a statistical model based on Jaccard similarity to measure the similarity between lists of features from two different datasets. This reduces the time required to search for correlations or to manually match each feature from one dataset to another.
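As a rough illustration of the statistical half of the hybrid model described above, the following minimal Python sketch scores feature names by token-level Jaccard similarity and picks the best match for each source feature. The function names (tokens, jaccard, best_matches) and the tokenization rule are assumptions made for illustration, not the published implementation, and the BERT component is omitted.

```python
import re


def tokens(name):
    """Lowercase a feature name and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a, b):
    """Jaccard similarity: size of the token intersection over the union."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def best_matches(source, target):
    """Map each source feature to the target feature with the highest score."""
    return {s: max(target, key=lambda t: jaccard(s, t)) for s in source}


# Hypothetical feature lists from two datasets.
matches = best_matches(
    ["customer_age", "zip code"],
    ["postal zip code", "age of customer"],
)
```

In a hybrid setup, a score like this could be combined with an embedding-based similarity, but how the article weights the two is not stated here.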
Analysis Photovoltaic System in Relation to Tracking and Non-Tracking System
Journal of Fundamentals of Renewable Energy and Applications - Feb 15, 2021
The increasing demand for electricity has been a great concern in recent years. This demand, together with environmental issues such as global warming, has urged scientists to advance the field of renewable energy. Solar energy is one of the major sources of renewable energy. Photovoltaic (PV) cells produce electrical energy when light particles knock electrons free from atoms. The electrical output of the system depends on the amount of solar energy received by the PV cells. To increase solar energy output, a fixed solar panel inclined towards the optimal point is usually used. The collection of solar energy is increased further by solar tracking systems, either single-axis or dual-axis, which continuously track the sun using the incidence angle of sunlight. The analysis compares the performance of tracking and non-tracking photovoltaic systems. Data from specific solar panel systems are analysed and compared with simulations and actual outputs to compute performance ratios and draw conclusions. The average performance ratio is found to be 0.73 for non-tracking systems and 0.90 for tracking systems (17 percentage points higher than for non-tracking systems). The accuracy of the estimated output of a PV system can be improved by using more accurate solar irradiance data, accurate weather conditions, exact system losses, and matched inverter efficiency. The efficiency of a PV system can be improved by using solar trackers, using more efficient solar panels, installing them in a less shaded area, cleaning the panels at regular intervals, and using more efficient electrical components.
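The performance ratio comparison above can be sketched in a few lines of Python. This is a minimal sketch that assumes the common definition of performance ratio as actual energy output divided by expected (simulated) output; the function name and the sample energy figures are hypothetical, and only the 0.73 and 0.90 ratios come from the abstract.

```python
def performance_ratio(actual_kwh, expected_kwh):
    """Ratio of measured energy output to the expected (simulated) output."""
    if expected_kwh <= 0:
        raise ValueError("expected_kwh must be positive")
    return actual_kwh / expected_kwh


# Hypothetical energy figures chosen to reproduce the reported ratios.
pr_fixed = performance_ratio(730.0, 1000.0)     # non-tracking system
pr_tracking = performance_ratio(900.0, 1000.0)  # tracking system

# Absolute difference between the ratios (in ratio units).
absolute_gain = pr_tracking - pr_fixed

# Relative improvement of tracking over non-tracking.
relative_gain = absolute_gain / pr_fixed
```

With the reported ratios, the absolute difference is 0.17, while the relative improvement over the non-tracking ratio works out to roughly 23%.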