Applied Scientist, AGI Info
El Segundo, California, USA
Full Time Mid-level / Intermediate USD 136K - 223K
Amazon.com
Amazon is seeking an exceptional Senior Applied Scientist to join the AGI Info Content team. In this role, you will be at the forefront of developing and enhancing the intelligence of the AmazonBot crawler and its content processing. The team is a key enabler of Amazon's AGI initiatives, such as building data pipelines for Olympus model training and collecting data for AGI Info grounding services. Our systems operate at web scale. This requires combining innovation that draws on state-of-the-art (SOTA) ML techniques with model optimization to handle 100k+ requests/decisions per second. Your work will directly impact the quality and efficiency of our data acquisition efforts, ultimately benefiting millions of customers worldwide.
Key job responsibilities
- Design, develop, and implement advanced algorithms and machine learning models to improve the intelligence and effectiveness of our web crawler and content processing pipelines.
- Collaborate with cross-functional teams to identify and prioritize crawling targets, ensuring alignment with business objectives
- Analyze and optimize crawling strategies to maximize coverage, freshness, and quality of acquired data while minimizing operational costs, and dive deep into the data to select the highest-quality data for LLM training and grounding.
- Conduct in-depth research to stay at the forefront of web acquisition and processing.
- Develop and maintain scalable, fault-tolerant systems to handle the vast scale of Amazon's web crawling operations
- Monitor and analyze performance metrics, identifying opportunities for improvement and implementing data-driven optimizations
- Mentor and guide junior team members, fostering a culture of innovation and continuous learning
Basic Qualifications
- 3+ years of building models for business application experience
- PhD, or Master's degree and 4+ years of CS, CE, ML or related field experience
- Experience programming in Java, C++, Python or related language
- Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing
Preferred Qualifications
- Experience using Unix/Linux
- Experience in professional software development
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $136,000/year in our lowest geographic market up to $223,400/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.
Category: Data Science Jobs
Tags: AGI, Data Mining, Data pipelines, Java, Linux, LLMs, Machine Learning, ML models, Model training, PhD, Pipelines, Python, Research
Perks/benefits: Career development, Equity / stock options, Health care
Region: North America
Country: United States