AWS Certified Machine Learning - Specialty (MLS-C01)

This quiz randomly draws 30 questions from our question bank, to be answered in 60 minutes, in the style of the AWS Certified Machine Learning - Specialty (MLS-C01) exam. The real MLS-C01 exam has 65 questions and a total time of 180 minutes. Of these, 15 questions are unscored, and only 50 questions are scored. For best results, practice multiple times until you achieve 100% accuracy.

1 / 30

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a
Machine Learning Specialist would like to build a binary classifier based on two features: age of account
and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST recall with respect to the fraudulent class?
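As a reminder of the metric this question refers to, here is a minimal sketch of how recall for one class differs from overall accuracy. The counts are hypothetical and are not taken from the figure; the function names are illustrative only.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives (here: fraudulent cases) the model found."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions, positive or negative, that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts, with "fraudulent" as the positive class.
tp, fn, fp, tn = 8, 2, 30, 60

fraud_recall = recall(tp, fn)                 # 8 of 10 fraud cases caught
overall_accuracy = accuracy(tp, tn, fp, fn)   # 68 of 100 predictions correct

print(f"recall={fraud_recall:.2f}, accuracy={overall_accuracy:.2f}")
```

A model can score well on one of these metrics and poorly on the other, so the same figure may suggest different models depending on which metric the question asks for.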

2 / 30

A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access
notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist
needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the
deployed SageMaker endpoints, and all errors that are generated when an endpoint is invoked.
Which services are integrated with Amazon SageMaker to track this information? (Choose two.)

3 / 30

A Data Science team is designing a dataset repository where it will store a large amount of training data
commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new
datasets every day, the solution has to scale automatically and be cost-effective. Also, it must be possible
to explore the data using SQL.
Which storage scheme is MOST suitable for this scenario?

4 / 30

A gaming company has launched an online game where people can start playing for free, but they need to
pay if they choose to use certain features. The company needs to build an automated system to predict
whether or not a new user will become a paid user within 1 year. The company has gathered a labeled
dataset from 1 million users.
The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year)
and 999,000 negative samples (from users who did not use any paid features). Each data sample consists
of 200 features including user age, device, location, and play patterns.
Using this dataset for training, the Data Science team trained a random forest model that converged with
over 99% accuracy on the training set. However, the prediction results on a test dataset were not
satisfactory.
Which of the following approaches should the Data Science team take to mitigate this issue? (Choose
two.)
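To see why 99% training accuracy says little here, the arithmetic from the scenario can be sketched as follows. The "always predict non-paying" baseline is a hypothetical comparison point introduced for illustration; it is not part of the question.

```python
# Class counts as stated in the scenario.
positives = 1_000      # users who paid within 1 year
negatives = 999_000    # users who did not
total = positives + negatives

# Hypothetical baseline: predict "will not pay" for every user.
baseline_accuracy = negatives / total   # every negative is correct
baseline_recall = 0 / positives         # no paying user is ever identified

print(f"accuracy={baseline_accuracy:.3%}, "
      f"recall on paying users={baseline_recall:.0%}")
# accuracy=99.900%, recall on paying users=0%
```

With only 0.1% positive samples, accuracy alone cannot distinguish a useful model from one that ignores the paying class entirely.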

5 / 30

A Machine Learning Specialist has completed a proof of concept for a company using a small data sample,
and now the Specialist is ready to implement an end-to-end solution in AWS using Amazon SageMaker.
The historical training data is stored in Amazon RDS.
Which approach should the Specialist use for training a model using that data?

6 / 30

A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is
using one of the SageMaker built-in algorithms for the training. The dataset is stored in .CSV format and is
transformed into a numpy.array, which appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?

7 / 30

A Data Engineer needs to build a model using a dataset containing customer credit card information.
How can the Data Engineer ensure the data remains encrypted and the credit card information is secure?

8 / 30

A company is running a machine learning prediction service that generates 100 TB of predictions every
day. A Machine Learning Specialist must generate a visualization of the daily precision-recall curve from
the predictions, and forward a read-only version to the Business team.
Which solution requires the LEAST coding effort?

9 / 30

An employee found a video clip with audio on a company's social media feed. The language used in the
video is Spanish. English is the employee's first language, and they do not understand Spanish. The
employee wants to do a sentiment analysis.
What combination of services is the MOST efficient to accomplish the task?

10 / 30

A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model
using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow
will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-
through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease
costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

11 / 30

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A
Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model
powering the widget.
What should the Specialist do to meet these requirements?

12 / 30

A Machine Learning Specialist built an image classification deep learning model. However, the Specialist
ran into an overfitting problem in which the training and testing accuracies were 99% and 75%,
respectively.
How should the Specialist address this issue and what is the reason behind it?

13 / 30

An insurance company is developing a new device for vehicles that uses a camera to observe drivers'
behavior and alert them when they appear distracted. The company created approximately 10,000 training
images in a controlled environment that a Machine Learning Specialist will use to train and evaluate
machine learning models.
During the model evaluation, the Specialist notices that the training error rate diminishes faster as the
number of epochs increases and the model is not accurately inferring on the unseen test images.
Which of the following should be used to resolve this issue? (Choose two.)

14 / 30

A Machine Learning team uses Amazon SageMaker to train an Apache MXNet handwritten digit classifier
model using a research dataset. The team wants to receive a notification when the model is overfitting.
Auditors want to view the Amazon SageMaker log activity report to ensure there are no unauthorized API
calls.
What should the Machine Learning team do to address the requirements with the least amount of code and
fewest steps?

15 / 30

A manufacturing company has a large set of labeled historical sales data. The manufacturer would like to
predict how many units of a particular part should be produced each quarter.
Which machine learning approach should be used to solve this problem?

16 / 30

A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The
Specialist needs to understand whether the model is more frequently overestimating or underestimating
the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target
value?

17 / 30

A Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The
Specialist wants to use transfer learning and an existing model trained on images of general objects. The
Specialist collated a large custom dataset of pictures containing different vehicle makes and models.
What should the Specialist do to initialize the model to re-train it with the custom data?

18 / 30

A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using
Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records
will require new but simple transformations, and some attributes will be combined. The model needs to be
retrained daily.
Given the large number of stores and the legacy data ingestion, which change will require the LEAST
amount of development effort?

19 / 30

A Machine Learning Specialist must build out a process to query a dataset on Amazon S3 using Amazon
Athena. The dataset contains more than 800,000 records stored as plaintext CSV files. Each record
contains 200 columns and is approximately 1.5 MB in size. Most queries will span 5 to 10 columns only.
How should the Machine Learning Specialist transform the dataset to minimize query runtime?
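The numbers in this scenario can be put in perspective with some quick arithmetic. This is a rough sketch that assumes decimal units and columns of roughly equal size, neither of which the question states.

```python
# Figures as stated in the scenario.
records = 800_000
record_mb = 1.5
total_columns = 200

# Approximate dataset size, using 1 TB = 1,000,000 MB.
total_tb = records * record_mb / 1_000_000
print(f"dataset size is roughly {total_tb:.1f} TB")

# Share of the columns that a typical query touches.
# Assumption: all columns are about the same size.
shares = {queried: queried / total_columns for queried in (5, 10)}
for queried, share in shares.items():
    print(f"{queried} of {total_columns} columns = {share:.1%} of the data")
```

Under these assumptions, a typical query needs only a few percent of the stored bytes.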

20 / 30

The displayed graph is from a forecasting model for testing a time series.

Considering the graph only, which conclusion should a Machine Learning Specialist make about the
behavior of the model?

21 / 30

A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored
in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create
a security vulnerability where malicious code running on the instances could compromise data privacy. The
company mandates that all instances stay within a secured VPC with no internet access, and data
communication traffic must stay within the AWS network.
How should the Data Science team configure the notebook instance placement to meet these
requirements?

22 / 30

A Marketing Manager at a pet insurance company plans to launch a targeted marketing campaign on
social media to acquire new customers. Currently, the company has the following data in Amazon Aurora:
Profiles for all past and existing customers
Profiles for all past and existing insured pets
Policy-level information
Premiums received
Claims paid
What steps should be taken to implement a machine learning model to identify potential new customers on
social media?

23 / 30

A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training,
and needs to continue working for an extended period with no Wi-Fi access.
Which approach should the Specialist use to continue working?

24 / 30

A Machine Learning Specialist is working with a large cybersecurity company that manages security
events in real time for companies around the world. The cybersecurity company wants to design a solution
that will allow it to use machine learning to score malicious events as anomalies on the data as it is being
ingested. The company also wants to be able to save the results in its data lake for later processing and
analysis.
What is the MOST efficient way to accomplish these tasks?

25 / 30

A Machine Learning Specialist is building a logistic regression model that will predict whether or not a person will order a pizza. The Specialist is trying to build the optimal model with an ideal classification threshold. What model evaluation technique should the Specialist use to understand how different classification thresholds will impact the model's performance?
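The effect of the classification threshold mentioned in this question can be illustrated with a small sketch. The scores and labels below are made up, and `confusion_at` is a hypothetical helper written for this example, not a library function.

```python
def confusion_at(threshold, scores, labels):
    """Count (tp, fp, tn, fn) when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Made-up predicted probabilities and true labels (1 = orders a pizza).
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 0, 1]

for threshold in (0.3, 0.5, 0.7):
    tp, fp, tn, fn = confusion_at(threshold, scores, labels)
    tpr = tp / (tp + fn)   # true positive rate
    fpr = fp / (fp + tn)   # false positive rate
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Raising the threshold in this toy data removes false positives but also misses a true positive, which is the trade-off the Specialist needs to examine across the full range of thresholds.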

26 / 30

A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a
corporate VPC. The ML Specialist has important data stored on the Amazon SageMaker notebook
instance's Amazon EBS volume, and needs to take a snapshot of that EBS volume. However, the ML
Specialist cannot find the Amazon SageMaker notebook instance's EBS volume or Amazon EC2 instance
within the VPC.
Why can the ML Specialist not see the instance in the VPC?

27 / 30

A Mobile Network Operator is building an analytics platform to analyze and optimize a company's
operations using Amazon Athena and Amazon S3. The source systems send data in .CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it on Amazon S3. Which solution takes the LEAST effort to implement?

28 / 30

A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time
streaming data.
The ingestion process must buffer and convert incoming records from JSON to a query-optimized,
columnar format without data loss. The output datastore must be highly available, and Analysts must be
able to run SQL queries against the data and connect to existing business intelligence dashboards.
Which solution should the Data Scientist build to satisfy the requirements?

29 / 30

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a
Machine Learning Specialist would like to build a binary classifier based on two features: age of account
and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST accuracy?

30 / 30

During mini-batch training of a neural network for a classification problem, a Data Scientist notices that
training accuracy oscillates.
What is the MOST likely cause of this issue?
