
Senior Staff Data Scientist, Catalog
- Seoul
- Permanent position
- Full-time
- Lead the transition of the Catalog Data Analytics team from descriptive reporting to predictive modeling, causal inference, and robust experimentation.
- Develop and deploy ML models for catalog health scoring, anomaly detection, and data quality prediction across millions of SKUs.
- Apply advanced techniques (e.g., transformers, foundation models, Bayesian inference) to improve structured/semi-structured product data accuracy and completeness.
- Strategic Impact Projects
- Tracking Catalog Health: Design scientific frameworks and metrics to monitor catalog quality. Build predictive systems that proactively detect emerging data issues or content decay.
- Automating SOPs: Use data science, ML, and LLMs to automate repetitive catalog management processes and reduce manual errors in data curation and enrichment.
- Cross-Functional Collaboration
- Partner with the Backend Engineering team to build scalable, production-grade pipelines and integrate modeling solutions into core catalog services.
- Collaborate with Product Management and Business Leadership to define problem statements, prioritize initiatives, and ensure measurable business impact.
- Work closely with Data Analysts to elevate the team's analytical rigor, guide experimental design, and support training in advanced scientific approaches.
- Mentorship and Culture Building
- Mentor data scientists and analysts to deepen the team's scientific bench strength.
- Advocate for a culture of rigorous testing, peer review, and continuous learning.
- MS or PhD in Computer Science, Statistics, Applied Mathematics, or a related quantitative field.
- 8+ years of experience in data science or applied research roles, preferably in high-scale technology or e-commerce environments.
- Demonstrated expertise in predictive modeling, causal inference, and A/B testing at scale.
- Strong proficiency in Python and SQL; experience with large-scale data tools (e.g., Spark, Airflow) and ML frameworks (e.g., TensorFlow, PyTorch).
- Proven experience working closely with backend engineering and product teams to deploy solutions in production environments.
- Experience with structured product data, taxonomy systems, or catalog health in e-commerce or logistics.
- Background in applying LLMs or NLP to structured/semi-structured data environments.
- Strong ability to translate business problems into scientific questions and communicate findings to senior leadership.
- Prior experience in upskilling analytical teams or building scientific rigor in existing data organizations.