AWS Certified Machine Learning - Specialty (MLS-C01)

This quiz randomly draws 30 questions from our question bank, modeled on the AWS Certified Machine Learning - Specialty (MLS-C01) exam, with a 60-minute time limit. The real MLS-C01 exam has 65 questions and a total time of 180 minutes. Of these, 15 questions are unscored, so only 50 count toward the score. For best results, practice multiple times until you achieve 100% accuracy.

1 / 30

When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Choose three.)

2 / 30

A manufacturing company has a large set of labeled historical sales data. The manufacturer would like to
predict how many units of a particular part should be produced each quarter.
Which machine learning approach should be used to solve this problem?

3 / 30

A Machine Learning Specialist at a company sensitive to security is preparing a dataset for model training.
The dataset is stored in Amazon S3 and contains Personally Identifiable Information (PII).
The dataset:
Must be accessible from a VPC only.
Must not traverse the public internet.
How can these requirements be satisfied?

4 / 30

A Machine Learning Specialist has created a deep learning neural network model that performs well on the
training data but performs poorly on the test data.
Which of the following methods should the Specialist consider using to correct this? (Choose three.)

5 / 30

A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time
streaming data.
The ingestion process must buffer and convert incoming records from JSON to a query-optimized,
columnar format without data loss. The output datastore must be highly available, and Analysts must be
able to run SQL queries against the data and connect to existing business intelligence dashboards.
Which solution should the Data Scientist build to satisfy the requirements?

6 / 30

A Mobile Network Operator is building an analytics platform to analyze and optimize a company's
operations using Amazon Athena and Amazon S3. The source systems send data in .CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it on Amazon S3. Which solution takes the LEAST effort to implement?

7 / 30

A Machine Learning Specialist is building a logistic regression model that will predict whether or not a person will order a pizza. The Specialist is trying to build the optimal model with an ideal classification threshold. What model evaluation technique should the Specialist use to understand how different classification thresholds will impact the model's performance?
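To make the idea of a classification threshold concrete, here is a minimal sketch of how moving the threshold changes the true-positive and false-positive rates of a binary classifier. The scores and labels are hypothetical illustration data, not taken from the question, and the helper name `rates_at` is made up for this example.

```python
# Hypothetical predicted probabilities and true labels (1 = orders a pizza).
# These values are invented for illustration only.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0]


def rates_at(threshold):
    """Return (true-positive rate, false-positive rate) at a given threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Sweep a few thresholds and report both rates for each one.
for threshold in (0.25, 0.50, 0.75):
    tpr, fpr = rates_at(threshold)
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

In this toy data, lowering the threshold catches more true positives at the cost of more false positives, which is the trade-off the question asks the Specialist to examine.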

8 / 30

During mini-batch training of a neural network for a classification problem, a Data Scientist notices that
training accuracy oscillates.
What is the MOST likely cause of this issue?

9 / 30

A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which solution requires the LEAST effort to be able to query this data?

10 / 30

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a
Machine Learning Specialist would like to build a binary classifier based on two features: age of account
and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST accuracy?

11 / 30

A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training,
and needs to continue working for an extended period with no Wi-Fi access.
Which approach should the Specialist use to continue working?

12 / 30

A Machine Learning Specialist is required to build a supervised image-recognition model to identify a cat.
The ML Specialist performs some tests and records the following results for a neural network-based image
classifier:
Total number of images available = 1,000
Test set images = 100 (constant test set)
The ML Specialist notices that, in over 75% of the misclassified images, the cats were held upside down by
their owners.
Which techniques can be used by the ML Specialist to improve this specific test error?

13 / 30

A large consumer goods manufacturer has the following products on sale:
1. 34 different toothpaste variants
2. 48 different toothbrush variants
3. 43 different mouthwash variants
The entire sales history of all these products is available in Amazon S3. Currently, the company is using
custom-built autoregressive integrated moving average (ARIMA) models to forecast demand for these
products. The company wants to predict the demand for a new product that will soon be launched.
Which solution should a Machine Learning Specialist apply?

14 / 30

A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The
Specialist needs to understand whether the model is more frequently overestimating or underestimating
the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target
value?

15 / 30

A Data Scientist wants to gain real-time insights into a data stream of GZIP files.
Which solution would allow the use of SQL to query the stream with the LEAST latency?

16 / 30

A company is observing low accuracy while training on the default built-in image classification algorithm in
Amazon SageMaker. The Data Science team wants to use an Inception neural network architecture
instead of a ResNet architecture.
Which of the following will accomplish this? (Choose two.)

17 / 30

A company is using Amazon Polly to translate plaintext documents to speech for automated company
announcements. However, company acronyms are being mispronounced in the current documents.
How should a Machine Learning Specialist address this issue for future documents?

18 / 30

A Machine Learning Specialist built an image classification deep learning model. However, the Specialist
ran into an overfitting problem in which the training and testing accuracies were 99% and 75%,
respectively.
How should the Specialist address this issue and what is the reason behind it?

19 / 30

A large mobile network operating company is building a machine learning model to predict customers who
are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as
the cost of churn is far greater than the cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:

Based on the model evaluation results, why is this a viable model for production?

20 / 30

A monitoring service generates 1 TB of scale metrics record data every minute. A Research team performs
queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the
team requires better performance. How should the records be stored in Amazon S3 to improve query performance?

21 / 30

An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine
Learning Specialist believes that certain columns in the dataset could be used to reconstruct the missing
data.
Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?

22 / 30

A gaming company has launched an online game where people can start playing for free, but they need to
pay if they choose to use certain features. The company needs to build an automated system to predict
whether or not a new user will become a paid user within 1 year. The company has gathered a labeled
dataset from 1 million users.
The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year)
and 999,000 negative samples (from users who did not use any paid features). Each data sample consists
of 200 features including user age, device, location, and play patterns.
Using this dataset for training, the Data Science team trained a random forest model that converged with
over 99% accuracy on the training set. However, the prediction results on a test dataset were not
satisfactory.
Which of the following approaches should the Data Science team take to mitigate this issue? (Choose
two.)
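As a quick check on the numbers in this scenario, the sketch below computes what a trivial baseline would score on the stated class counts. The "always predict negative" baseline is a hypothetical used only to show why accuracy can mislead here; it is not the team's random forest.

```python
# Class counts taken from the question text.
positives = 1_000      # users who paid within 1 year
negatives = 999_000    # users who did not use any paid features
total = positives + negatives

# Hypothetical baseline: label every user as "will not pay".
true_positives = 0
false_negatives = positives
true_negatives = negatives

baseline_accuracy = true_negatives / total
baseline_recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {baseline_accuracy:.1%}")
print(f"recall on paying users = {baseline_recall:.1%}")
```

With only 0.1% positive samples, a model can exceed 99% accuracy without identifying a single paying user, so accuracy alone says little about performance on the minority class.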

23 / 30

A Machine Learning Specialist must build out a process to query a dataset on Amazon S3 using Amazon
Athena. The dataset contains more than 800,000 records stored as plaintext CSV files. Each record
contains 200 columns and is approximately 1.5 MB in size. Most queries will span 5 to 10 columns only.
How should the Machine Learning Specialist transform the dataset to minimize query runtime?
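The figures in this scenario can be turned into a rough estimate of how much data is stored versus how much a typical query needs. This is a back-of-the-envelope sketch: it uses decimal units, treats "more than 800,000" and "approximately 1.5 MB" as exact, and assumes all 200 columns are about the same size, which the question does not state.

```python
# Figures taken from the question text (treated as exact for the estimate).
records = 800_000
record_size_mb = 1.5
total_columns = 200

# Total dataset size in decimal terabytes (1 TB = 1,000,000 MB).
dataset_size_tb = records * record_size_mb / 1_000_000
print(f"dataset size ~ {dataset_size_tb:.1f} TB")

# Fraction of columns a typical query touches.
# Assumption: columns are roughly equal in size.
fractions = {}
for queried_columns in (5, 10):
    fractions[queried_columns] = queried_columns / total_columns
    print(f"{queried_columns} of {total_columns} columns = "
          f"{fractions[queried_columns]:.1%} of the data")
```

Under these assumptions, a query over 5 to 10 columns needs only a few percent of the stored bytes, which is the gap the transformation in the question is meant to exploit.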

24 / 30

An agency collects census information within a country to determine healthcare and social program needs
by province and city. The census form collects responses for approximately 500 questions from each
citizen.
Which combination of algorithms would provide the appropriate insights? (Select TWO.)

25 / 30

A Machine Learning Specialist working for an online fashion company wants to build a data ingestion
solution for the company's Amazon S3-based data lake.
The Specialist wants to create a set of ingestion mechanisms that will enable future capabilities comprised
of:
Real-time analytics
Interactive analytics of historical data
Clickstream analytics
Product recommendations
Which services should the Specialist use?

26 / 30

Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?

27 / 30

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a
Machine Learning Specialist would like to build a binary classifier based on two features: age of account
and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST recall with respect to the fraudulent class?

28 / 30

A Machine Learning Specialist is working with a media company to perform classification on popular articles
from the company's website. The company is using random forests to classify how popular an article will
be before it is published. A sample of the data being used is below.
Given the dataset, the Specialist wants to convert the Day_Of_Week column to binary values.
What technique should be used to convert this column to binary values?

29 / 30

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A
Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model
powering the widget.
What should the Specialist do to meet these requirements?

30 / 30

A retail company intends to use machine learning to categorize new products. A labeled dataset of current
products was provided to the Data Science team. The dataset includes 1,200 products. The labeled
dataset has 15 features for each product such as title dimensions, weight, and price. Each product is
labeled as belonging to one of six categories such as books, games, electronics, and movies.
Which model should be used for categorizing new products using the provided dataset for training?
