



AWS Certified Machine Learning - Specialty (MLS-C01)

This quiz randomly generates 30 questions (to be answered in 60 minutes) of the kind asked in the AWS Certified Machine Learning - Specialty (MLS-C01) exam. The real MLS-C01 exam has 65 questions and a total time of 180 minutes. Of these, 15 questions are unscored, and only 50 questions are scored. The questions here are drawn at random from our question bank. For best results, practice multiple times until you achieve 100% accuracy.

1 / 30

A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model
using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow
will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model
click-through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease
costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

2 / 30

A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The
Specialist needs to understand whether the model is more frequently overestimating or underestimating
the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target
value?
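
To make the idea of over- versus underestimation concrete, here is a minimal sketch that looks at the sign of each residual (actual minus predicted). The function name and the sample values are hypothetical and only serve as an illustration; they do not come from the question.

```python
def residual_summary(actual, predicted):
    """Count how often predictions land above or below the true target."""
    # Residual = actual - predicted, so a negative residual means the
    # model predicted too high (overestimate).
    residuals = [a - p for a, p in zip(actual, predicted)]
    over = sum(1 for r in residuals if r < 0)
    under = sum(1 for r in residuals if r > 0)
    mean_residual = sum(residuals) / len(residuals)
    return {"over": over, "under": under, "mean_residual": mean_residual}


# Hypothetical values, for illustration only.
actual = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 13.5, 14.0, 22.0]
summary = residual_summary(actual, predicted)
print(summary)
```

In practice the same residuals are usually inspected as a plot or histogram rather than as counts, which shows the size of the errors as well as their direction.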

3 / 30

A Machine Learning Specialist is developing a custom video recommendation model for an application.
The dataset used to train this model is very large with millions of data points and is hosted in an Amazon
S3 bucket. The Specialist wants to avoid loading all of this data onto an Amazon SageMaker notebook
instance because it would take hours to move and will exceed the attached 5 GB Amazon EBS volume on
the notebook instance.
Which approach allows the Specialist to use all the data to train the model?

4 / 30

A Machine Learning Specialist deployed a model that provides product recommendations on a company's
website. Initially, the model was performing very well and resulted in customers buying more products on
average. However, within the past few months, the Specialist has noticed that the effect of product
recommendations has diminished and customers are starting to return to their original habits of spending
less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment
over a year ago.
Which method should the Specialist try to improve model performance?

5 / 30

An office security agency conducted a successful pilot using 100 cameras installed at key locations within
the main office. Images from the cameras were uploaded to Amazon S3 and tagged using Amazon
Rekognition, and the results were stored in Amazon ES. The agency is now looking to expand the pilot into
a full production system using thousands of video cameras in its office locations globally. The goal is to
identify activities performed by non-employees in real time.
Which solution should the agency consider?

6 / 30

An agency collects census information within a country to determine healthcare and social program needs
by province and city. The census form collects responses for approximately 500 questions from each
citizen.
Which combination of algorithms would provide the appropriate insights? (Select TWO.)

7 / 30

A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access
notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist
needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the
deployed SageMaker endpoints, and all errors that are generated when an endpoint is invoked.
Which services are integrated with Amazon SageMaker to track this information? (Choose two.)

8 / 30

A Machine Learning Specialist has created a deep learning neural network model that performs well on the
training data but performs poorly on the test data.
Which of the following methods should the Specialist consider using to correct this? (Choose three.)

9 / 30

A Data Scientist is working on an application that performs sentiment analysis. The validation accuracy is
poor, and the Data Scientist thinks that the cause may be a rich vocabulary and a low average frequency
of words in the dataset.
Which tool should be used to improve the validation accuracy?

10 / 30

A Machine Learning team uses Amazon SageMaker to train an Apache MXNet handwritten digit classifier
model using a research dataset. The team wants to receive a notification when the model is overfitting.
Auditors want to view the Amazon SageMaker log activity report to ensure there are no unauthorized API
calls.
What should the Machine Learning team do to address the requirements with the least amount of code and
fewest steps?

11 / 30

A Machine Learning Specialist is building a model that will perform time series forecasting using Amazon
SageMaker. The Specialist has finished training the model and is now planning to perform load testing on
the endpoint so they can configure Auto Scaling for the model variant.
Which approach will allow the Specialist to review the latency, memory utilization, and CPU utilization
during the load test?

12 / 30

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A
Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model
powering the widget.
What should the Specialist do to meet these requirements?

13 / 30

An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine
Learning Specialist believes that certain columns in the dataset could be used to reconstruct the missing
data.
Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?

14 / 30

During mini-batch training of a neural network for a classification problem, a Data Scientist notices that
training accuracy oscillates.
What is the MOST likely cause of this issue?

15 / 30

A company is setting up an Amazon SageMaker environment. The corporate data security policy does not
allow communication over the internet.
How can the company enable the Amazon SageMaker service without enabling direct internet access to
Amazon SageMaker notebook instances?

16 / 30

A Machine Learning Specialist is creating a new natural language processing application that processes a
dataset comprised of 1 million sentences. The aim is to then run Word2Vec to generate embeddings of the
sentences and enable different types of predictions.
Here is an example from the dataset:
"The quck BROWN FOX jumps over the lazy dog."
Which of the following are the operations the Specialist needs to perform to correctly sanitize and prepare
the data in a repeatable manner? (Choose three.)
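
As a rough illustration of what a repeatable text-preparation step can look like, the sketch below lowercases the example sentence, splits it into word tokens, and drops a few common words. The stop-word list and the function name are assumptions made for this example; a real pipeline would typically use a fuller list from an NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "over"}


def prepare(sentence):
    """Lowercase, tokenize on letters, and remove stop words."""
    lowered = sentence.lower()
    # Keep runs of letters only, which also discards punctuation.
    tokens = re.findall(r"[a-z]+", lowered)
    return [t for t in tokens if t not in STOP_WORDS]


tokens = prepare("The quck BROWN FOX jumps over the lazy dog.")
print(tokens)
```

This sketch leaves the misspelled token "quck" untouched; whether to correct spelling is a separate decision and is not shown here.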

17 / 30

A Machine Learning Specialist receives customer data for an online shopping website. The data includes
demographics, past visits, and locality information. The Specialist must develop a machine learning
approach to identify the customer shopping patterns, preferences, and trends to enhance the website for
better service and smart recommendations.
Which solution should the Specialist recommend?

18 / 30

A Data Science team is designing a dataset repository where it will store a large amount of training data
commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new
datasets every day, the solution has to scale automatically and be cost-effective. Also, it must be possible
to explore the data using SQL.
Which storage scheme is MOST adapted to this scenario?

19 / 30

A Machine Learning Specialist is working with a large company to leverage machine learning within its
products. The company wants to group its customers into categories based on which customers will and
will not churn within the next 6 months. The company has labeled the data available to the Specialist.
Which machine learning model type should the Specialist use to accomplish this task?

20 / 30

A Machine Learning Specialist is building a logistic regression model that will predict whether or not a person will order a pizza. The Specialist is trying to build the optimal model with an ideal classification threshold. What model evaluation technique should the Specialist use to understand how different classification thresholds will impact the model's performance?
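
To show how a classification threshold changes a model's behavior, here is a small sketch that computes the false positive rate and true positive rate at several thresholds, which are the quantities plotted on an ROC curve. The labels, scores, and function name are hypothetical and chosen only for illustration.

```python
def roc_points(labels, scores, thresholds):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((t, fp / negatives, tp / positives))
    return points


# Hypothetical labels (1 = orders pizza) and predicted probabilities.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
points = roc_points(labels, scores, [0.25, 0.5, 0.75])
for threshold, fpr, tpr in points:
    print(threshold, round(fpr, 3), round(tpr, 3))
```

Lowering the threshold raises both rates, so the trade-off between catching positives and raising false alarms becomes visible across thresholds.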

21 / 30

A Machine Learning Specialist is designing a system for improving sales for a company. The objective is to
use the large amount of information the company has on users' behavior and product preferences to
predict which products users would like based on the users' similarity to other users.
What should the Specialist do to meet this objective?

22 / 30

A Data Engineer needs to build a model using a dataset containing customer credit card information.
How can the Data Engineer ensure the data remains encrypted and the credit card information is secure?

23 / 30

A manufacturing company has a large set of labeled historical sales data. The manufacturer would like to
predict how many units of a particular part should be produced each quarter.
Which machine learning approach should be used to solve this problem?

24 / 30

A retail company intends to use machine learning to categorize new products. A labeled dataset of current
products was provided to the Data Science team. The dataset includes 1,200 products. The labeled
dataset has 15 features for each product such as title, dimensions, weight, and price. Each product is
labeled as belonging to one of six categories such as books, games, electronics, and movies.
Which model should be used for categorizing new products using the provided dataset for training?

25 / 30

A large mobile network operating company is building a machine learning model to predict customers who
are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as
the cost of churn is far greater than the cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:

Based on the model evaluation results, why is this a viable model for production?

26 / 30

A Machine Learning Specialist is working with a media company to perform classification on popular articles
from the company's website. The company is using random forests to classify how popular an article will
be before it is published. A sample of the data being used is below.
Given the dataset, the Specialist wants to convert the Day_Of_Week column to binary values.
What technique should be used to convert this column to binary values?
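
For readers unfamiliar with turning a categorical column into binary values, the sketch below shows one common technique, one-hot encoding, applied to a day-of-week value. The sample data is not reproduced on this page, so the assumption that Day_Of_Week holds full English day names, along with the function name, is purely illustrative.

```python
# Assumed category values; the original sample data is not shown here.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def one_hot_day(day):
    """Return a 7-element binary vector with a single 1 for the given day."""
    return [1 if day == d else 0 for d in DAYS]


encoded = one_hot_day("Wednesday")
print(encoded)
```

Each day becomes its own binary column, so no artificial ordering between days is introduced.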

27 / 30

A large consumer goods manufacturer has the following products on sale:
1. 34 different toothpaste variants
2. 48 different toothbrush variants
3. 43 different mouthwash variants
The entire sales history of all these products is available in Amazon S3. Currently, the company is using
custom-built autoregressive integrated moving average (ARIMA) models to forecast demand for these
products. The company wants to predict the demand for a new product that will soon be launched.
Which solution should a Machine Learning Specialist apply?

28 / 30

A Data Scientist is developing a machine learning model to predict future patient outcomes based on
information collected about each patient and their treatment plans. The model should output a continuous
value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study
was conducted on a group of individuals over the age of 65 who have a particular disease that is known to
worsen with age.
Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that,
out of 4,000 patient observations, there are 450 where the patient age has been input as 0. The other
features for these observations appear normal compared to the rest of the sample population.
How should the Data Scientist correct this issue?
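
One common way to handle implausible values such as an age of 0 is to treat them as missing and fill them from the valid observations. The sketch below uses the median of the valid ages; this is a minimal illustration under that assumption, with hypothetical sample values and a hypothetical function name, and other imputation strategies are possible.

```python
from statistics import median


def impute_zero_ages(ages):
    """Replace ages recorded as 0 with the median of the valid ages."""
    valid = [a for a in ages if a > 0]
    fill = median(valid)
    return [a if a > 0 else fill for a in ages]


# Hypothetical ages, for illustration only.
ages = [70, 0, 82, 66, 0, 75]
cleaned = impute_zero_ages(ages)
print(cleaned)
```

This keeps the affected rows, which matters when the other features in those rows look normal.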

29 / 30

A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is
using one of the SageMaker built-in algorithms for the training. The dataset is stored in .CSV format and is
transformed into a numpy.array, which appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?

30 / 30

A company's Machine Learning Specialist needs to improve the training speed of a time-series forecasting
model using TensorFlow. The training is currently implemented on a single-GPU machine and takes
approximately 23 hours to complete. The training needs to be run daily.
The model accuracy is acceptable, but the company anticipates a continuous increase in the size of the
training data and a need to update the model on an hourly, rather than a daily, basis. The company also
wants to minimize coding effort and infrastructure changes.
What should the Machine Learning Specialist do to the training solution to allow it to scale for future
demand?
