Data Curation Intern
Tasks
- Assign quality tier labels
- Audit open source datasets
- Build data cleaning pipelines
- Build validation checklists
- Create metadata tagging schemas
- Curate phonetically diverse text for read speech
- Define audio metadata standards
- Detect sentence boundaries
- Document data provenance, licensing, and processing steps
- Fix encoding issues
- Generate quality scorecards
- Implement deduplication
- Normalize scripts
- Remove noise
- Transition text corpus to aligned speech dataset
Perks/Benefits
Skills/Tech-stack
ASR | CSV | DVC | Data Versioning | Data cleaning | Data profiling | Data provenance | Deduplication | Git LFS | Hugging Face | Hugging Face Datasets | Hugging Face Hub | JSONL | Language Processing | Metadata tagging | NLP | NLTK | Natural Language | Natural Language Processing | Pandas | Parquet | Phonetics | Python | Regex | spaCy | Speech Data | TTS | Text data
Education
N/A
Roles
Analyst Intern | Data Analyst | Data Analyst Intern | Data Curation Intern | Intern