Location: NYC – must be in the office 4 days per week (local candidates only)
US staffing In-Person Interview (same day offer)
Interview process – Still to be confirmed, but I expect an on-site interview for the final step, so candidates must be OK with that.
I only want your top 3 candidates at most, so send me your best 3, not your first 3. I will not consider any candidates beyond that. Please do not send resumes that are tailored exactly to the job description – that is a red flag and I will not set up interviews for those candidates. Please also send LinkedIn profiles – each profile should include a picture. Here is the job description:
Job Summary
We are seeking a highly skilled Senior Developer to lead the development of a Python-based platform that ingests internal data sources (e.g., Hadoop, REST APIs) and applies locally deployed large language models (LLMs) for text analysis, including issue classification and summarization. This role requires strong engineering expertise in building scalable data systems, combined with good communication skills.
Key Responsibilities
Design & Development
• Architect and implement a robust data ingestion pipeline using Python.
• Integrate with Hadoop and/or internal APIs for sourcing structured and unstructured data.
• Design modular components for data transformation, enrichment, and routing to downstream NLP models.
LLM Integration
• Incorporate locally deployed LLMs for classification and summarization tasks.
• Use prompt orchestration, chaining, and context-aware techniques to improve NLP accuracy and consistency.
• Ensure performance and stability of LLM-based components in production environments.
Collaboration & Engineering Practices
• Work closely with data engineers, product owners, and ML researchers to refine use cases and deliver high-quality solutions.
• Follow modern software engineering best practices including testing, CI/CD, and code documentation.
• Participate in design reviews and knowledge-sharing sessions.
Required Qualifications
• 4+ years of professional experience in Python software development.
• Proven experience working with big data systems, particularly Hadoop, PySpark, or related technologies.
• Practical experience using LLMs, vector databases, embedding pipelines, and retrieval-augmented generation (RAG) architectures.
• Familiarity with NLP tasks such as classification, summarization, and information extraction.
• Experience building or maintaining APIs and microservices.
• Experience with Model Context Protocol (MCP) for managing prompts and contextual data across LLM applications.
Preferred Qualifications
• Familiarity with LLMOps tools and scalable inference strategies.
• Prior work with LangChain, Hugging Face Transformers, or vLLM runtime environments.
• Data science experience, such as collaborating on model training and evaluation, aligning data processing logic to ML objectives, or working on feature engineering and experimentation pipelines.
• Background in financial services, enterprise software, or regulated environments.