Applied Scientist (L5), Selection Monitoring
Bengaluru, Karnataka, IND
Amazon.com
The Selection Monitoring team is responsible for making the biggest catalog on the planet even bigger. To drive expansion of the Amazon catalog, we use machine learning and cluster-computing technologies to process billions of products and algorithmically find products not already sold on Amazon. We work with structured documents, semi-structured documents and Visually Rich Documents using deep learning, NLP and image processing.
The role demands a high-performing and flexible candidate who can take responsibility for the success of the system and drive solutions from research and prototyping through design, coding and deployment. We are looking for Applied Scientists to tackle challenging problems in information extraction and efficient crawling at internet scale. You should have depth and breadth of knowledge in text mining, information extraction from Visually Rich Documents and semi-structured data (HTML), and machine learning. You should also have the programming and design skills to manipulate semi-structured and unstructured data and to build systems that work at internet scale.
You will encounter many challenges, including:
- Scale (build models to handle billions of pages)
- Accuracy (extreme requirements for precision and recall)
- Speed (generate predictions for millions of new or changed pages with low latency)
- Diversity (models need to work across different languages, marketplaces and data sources)
You will help us to:
- Build a scalable system that can algorithmically extract information from the World Wide Web
- Intelligently cluster web pages, segment and classify regions, extract relevant information and structure the data available on semi-structured web pages
- Build systems that use an existing knowledge base to perform open information extraction at scale from Visually Rich Documents
Key job responsibilities
- Use AI, NLP and advances in LLMs/SLMs to create scalable solutions for business problems
- Efficiently crawl the web, automate extraction of relevant information from large volumes of Visually Rich Documents, and optimize key processes
- Design, develop, evaluate and deploy innovative and highly scalable ML models
- Work closely with software engineering teams to drive real-time model implementations
- Establish scalable, efficient, automated processes for large-scale model development, model validation and model maintenance
- Lead projects and mentor other scientists and engineers in the use of ML techniques
Basic Qualifications
- 3+ years of experience building models for business applications
- PhD, or Master's degree and 4+ years of experience in CS, CE, ML or a related field
- Experience programming in Java, C++, Python or a related language
- Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing
Preferred Qualifications
- Experience using Unix/Linux
- Experience in professional software development
- Patents or publications at top-tier peer-reviewed conferences or journals
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.