
Contract role
Data Scientist ML Architect – Onsite
SFO, CA
6-12 Months
Data Scientist GenAI & Intelligent Automation
Role Summary
We are seeking a Data Scientist to design, build, and operationalize ML, GenAI, and predictive models that power an enterprise-scale, AI-driven Service Planning & Design (SP&D) platform. The role focuses on cost estimation, calibration, compliance intelligence, and document/image interpretation, working closely with GenAI agents, cloud architects, and domain SMEs.
Key Responsibilities
Machine Learning & Predictive Modeling
Design and develop ML models
Build and tune models using XGBoost, Random Forest, scikit-learn, and related frameworks
Evaluate model performance using MAE, RMSE, R², and error distribution analysis
GenAI & Agent-Driven AI
Collaborate with AI Engineers to embed ML models into GenAI-driven, multi-agent workflows
Work with RAG pipelines for document intelligence and contextual Q&A
Enable human-readable explanations for predictions and recommendations
Data Engineering & Feature Development
Analyze structured and unstructured datasets from historical estimates, actual costs, documents, and images
Perform feature engineering from:
SAP/EES data
Historical project attributes
Regulatory and standards documentation
Ensure data quality, normalization, and anomaly detection
Image & Non-Text Analytics (Preferred)
Support AI image analysis use cases:
Classification and attribute extraction from site photos and drawings
Compliance signals against engineering standards
Collaborate on pipelines using computer vision outputs and ML inference
MLOps & Model Lifecycle
Support model training, validation, and runtime invocation within cloud-native platforms
Work with DevOps and AI teams on:
Model versioning and reproducibility
Monitoring for drift, bias, and performance degradation
Provide inputs for MLOps / LLMOps pipelines and governance dashboards
Required Skills & Experience
Strong foundation in Data Science, Machine Learning, and Statistics
Hands-on experience with:
Python
scikit-learn, XGBoost
Data analysis libraries (NumPy, Pandas)
Experience building regression and calibration models
Strong understanding of model evaluation metrics
Experience working with large, complex enterprise datasets
Ability to explain model outputs in business-friendly language
Preferred Skills
Exposure to GenAI / LLM-enabled systems
Experience with RAG pipelines and vector search concepts
Familiarity with computer vision outputs and non-text data analysis
Experience in utilities, infrastructure, or regulated industries
Understanding of AI governance, explainability, and auditability
To apply for this job, email your details to reyansh@qualis1inc.com