
AWS Certified Machine Learning - Specialty (MLS-C01)

This quiz randomly generates 30 questions (to be answered in 60 minutes) from our question bank, in the style of the AWS Certified Machine Learning - Specialty (MLS-C01) exam. The real MLS-C01 exam has 65 questions and a total time of 180 minutes; of these, 15 are unscored pretest questions, and only 50 count toward the score. For best results, practice multiple times until you achieve 100% accuracy.

1 / 30

A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training,
and needs to continue working for an extended period with no Wi-Fi access.
Which approach should the Specialist use to continue working?

2 / 30

A Machine Learning Specialist is building a prediction model for a large number of features using linear
models, such as linear regression and logistic regression. During exploratory data analysis, the Specialist
observes that many features are highly correlated with each other. This may make the model unstable.
What should be done to reduce the impact of having such a large number of features?
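The scenario above hinges on detecting highly correlated feature pairs before choosing a remedy such as dropping features or applying PCA. A minimal sketch, using hypothetical data and a hand-rolled Pearson correlation:

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical features: f2 is almost a linear copy of f1.
f1 = [1.0, 2.0, 3.0, 4.0, 5.0]
f2 = [2.1, 3.9, 6.0, 8.2, 9.9]

r = pearson(f1, f2)
print(r)  # close to 1.0 -> a candidate pair for removal or dimensionality reduction
```

Feature pairs with |r| near 1 carry nearly the same information, which is what destabilizes linear-model coefficients.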

3 / 30

The displayed graph is from a forecasting model for testing a time series.

Considering the graph only, which conclusion should a Machine Learning Specialist make about the
behavior of the model?

4 / 30

A company's Machine Learning Specialist needs to improve the training speed of a time-series forecasting
model using TensorFlow. The training is currently implemented on a single-GPU machine and takes
approximately 23 hours to complete. The training needs to be run daily.
The model accuracy is acceptable, but the company anticipates a continuous increase in the size of the
training data and a need to update the model on an hourly, rather than a daily, basis. The company also
wants to minimize coding effort and infrastructure changes.
What should the Machine Learning Specialist do to the training solution to allow it to scale for future
demand?

5 / 30

A Data Scientist is working on an application that performs sentiment analysis. The validation accuracy is
poor, and the Data Scientist thinks that the cause may be a rich vocabulary and a low average frequency
of words in the dataset.
Which tool should be used to improve the validation accuracy?

6 / 30

A Data Engineer needs to build a model using a dataset containing customer credit card information.
How can the Data Engineer ensure the data remains encrypted and the credit card information is secure?

7 / 30

A Machine Learning Specialist deployed a model that provides product recommendations on a company's
website. Initially, the model was performing very well and resulted in customers buying more products on
average. However, within the past few months, the Specialist has noticed that the effect of product
recommendations has diminished and customers are starting to return to their original habits of spending
less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment
over a year ago.
Which method should the Specialist try to improve model performance?

8 / 30

A monitoring service generates 1 TB of scale metrics record data every minute. A Research team performs
queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the
team requires better performance. How should the records be stored in Amazon S3 to improve query performance?

9 / 30

A Machine Learning Specialist working for an online fashion company wants to build a data ingestion
solution for the company's Amazon S3-based data lake.
The Specialist wants to create a set of ingestion mechanisms that will enable future capabilities comprising:
• Real-time analytics
• Interactive analytics of historical data
• Clickstream analytics
• Product recommendations
Which services should the Specialist use?

10 / 30

A Machine Learning Specialist is building a convolutional neural network (CNN) that will classify 10 types
of animals. The Specialist has built a series of layers in a neural network that will take an input image of an
animal, pass it through a series of convolutional and pooling layers, and then finally pass it through a
dense and fully connected layer with 10 nodes. The Specialist would like to get an output from the neural
network that is a probability distribution of how likely it is that the input image belongs to each of the 10
classes.
Which function will produce the desired output?
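The output described above (a probability distribution over 10 classes) is what the softmax function produces from the final layer's raw scores. A minimal sketch, with hypothetical logits standing in for the 10-node dense layer's output:

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability, then normalize exponentials.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from the 10-node dense layer.
logits = [2.0, 1.0, 0.1, -1.0, 0.5, 0.0, 1.5, -0.5, 0.2, 0.3]
probs = softmax(logits)
print(sum(probs))  # 1.0: a valid probability distribution over the 10 classes
```

Each output is in (0, 1) and the outputs sum to 1, which is exactly the "how likely is each class" interpretation the question asks for.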

11 / 30

An interactive online dictionary wants to add a widget that displays words used in similar contexts. A
Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model
powering the widget.
What should the Specialist do to meet these requirements?

12 / 30

During mini-batch training of a neural network for a classification problem, a Data Scientist notices that
training accuracy oscillates.
What is the MOST likely cause of this issue?
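A common culprit for the oscillation described above is a learning rate that is too high, so each update overshoots the minimum. A toy sketch (gradient descent on f(x) = x², not the question's actual network) makes the effect visible:

```python
def descend(lr, steps=10, x=1.0):
    # Gradient descent on f(x) = x**2, whose gradient is 2*x.
    # Returns the trajectory of x across the steps.
    traj = [x]
    for _ in range(steps):
        x = x - lr * 2 * x
        traj.append(x)
    return traj

small = descend(lr=0.1)    # smooth, monotonic decay toward the minimum
large = descend(lr=0.95)   # x flips sign every step: the loss oscillates
print(small[-1], large[-1])
```

With the small learning rate the iterate decays steadily; with the large one it overshoots the minimum on every step, which is the training-curve oscillation the Data Scientist observes.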

13 / 30

A company is running a machine learning prediction service that generates 100 TB of predictions every
day. A Machine Learning Specialist must generate a visualization of the daily precision-recall curve from
the predictions, and forward a read-only version to the Business team.
Which solution requires the LEAST coding effort?

14 / 30

A Machine Learning Specialist built an image classification deep learning model. However, the Specialist
ran into an overfitting problem in which the training and testing accuracies were 99% and 75%,
respectively.
How should the Specialist address this issue and what is the reason behind it?

15 / 30

A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored
in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create
a security vulnerability where malicious code running on the instances could compromise data privacy. The
company mandates that all instances stay within a secured VPC with no internet access, and data
communication traffic must stay within the AWS network.
How should the Data Science team configure the notebook instance placement to meet these
requirements?

16 / 30

A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which solution requires the LEAST effort to be able to query this data?

17 / 30

A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The
Specialist needs to understand whether the model is more frequently overestimating or underestimating
the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target
value?
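The standard way to answer a question like this is residual analysis: compute predicted minus actual and inspect the distribution's sign. A minimal sketch with hypothetical values:

```python
# Hypothetical targets and model predictions (illustration only).
actual    = [10.0, 12.0, 9.0, 15.0, 11.0]
predicted = [11.5, 13.0, 9.5, 16.0, 12.5]

residuals = [p - a for p, a in zip(predicted, actual)]
mean_residual = sum(residuals) / len(residuals)

# A positive mean residual means the model tends to overestimate the target;
# a negative one means it tends to underestimate.
print(mean_residual)
```

In practice one would plot a histogram of the residuals rather than rely on the mean alone, but the sign tells the over/under story.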

18 / 30

A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model
using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow
will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease
costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

19 / 30

A Machine Learning Specialist is building a model to predict future employment rates based on a wide range of economic factors. While exploring the data, the Specialist notices that the magnitudes of the input features vary greatly. The Specialist does not want variables with a larger magnitude to dominate the model.
What should the Specialist do to prepare the data for model training?
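The magnitude problem described above is typically addressed by standardizing each feature (z-score scaling). A minimal pure-Python sketch with a hypothetical high-magnitude feature:

```python
import math

def standardize(values):
    # z-score: subtract the mean and divide by the (population) standard
    # deviation, so every feature lands on a comparable scale.
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Hypothetical economic feature with a large magnitude (e.g. GDP figures).
gdp = [21000.0, 22500.0, 19800.0, 23100.0, 20600.0]
z = standardize(gdp)
print(min(z), max(z))  # values now centered near 0 with unit variance
```

After scaling, no single feature dominates the model simply because of its units.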

20 / 30

A company is setting up an Amazon SageMaker environment. The corporate data security policy does not
allow communication over the internet.
How can the company enable the Amazon SageMaker service without enabling direct internet access to
Amazon SageMaker notebook instances?

21 / 30

A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is
using one of the SageMaker built-in algorithms for the training. The dataset is stored in .CSV format and is
transformed into a numpy.array, which appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?

22 / 30

When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Choose three.)

23 / 30

A gaming company has launched an online game where people can start playing for free, but they need to
pay if they choose to use certain features. The company needs to build an automated system to predict
whether or not a new user will become a paid user within 1 year. The company has gathered a labeled
dataset from 1 million users.
The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year)
and 999,000 negative samples (from users who did not use any paid features). Each data sample consists
of 200 features including user age, device, location, and play patterns.
Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory.
Which of the following approaches should the Data Science team take to mitigate this issue? (Choose
two.)

24 / 30

A Data Scientist is developing a machine learning model to classify whether a financial transaction is
fraudulent. The labeled data available for training consists of 100,000 non-fraudulent observations and
1,000 fraudulent observations.
The Data Scientist applies the XGBoost algorithm to the data, resulting in the following confusion matrix
when the trained model is applied to a previously unseen validation dataset. The accuracy of the model is
99.1%, but the Data Scientist has been asked to reduce the number of false negatives.
Which combination of steps should the Data Scientist take to reduce the number of false negative predictions by the model? (Choose two.)
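The scenario above is a classic accuracy paradox on imbalanced data: with roughly 100:1 class imbalance, a model can score ~99% accuracy while missing most fraud. A sketch with a hypothetical confusion matrix (counts chosen for illustration; the question's actual matrix is not shown here):

```python
# Hypothetical confusion matrix for an imbalanced fraud dataset.
tn, fp = 9900, 10   # non-fraud correctly / incorrectly classified
fn, tp = 80, 10     # fraud missed / fraud caught

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall   = tp / (tp + fn)   # fraction of actual fraud that was caught

print(accuracy)  # ~0.99: looks excellent
print(recall)    # ~0.11: most fraud is missed
```

This is why recall (or the precision-recall trade-off) matters far more than accuracy here, and why remedies target the minority class.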

25 / 30

A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a
corporate VPC. The ML Specialist has important data stored on the Amazon SageMaker notebook
instance's Amazon EBS volume, and needs to take a snapshot of that EBS volume. However, the ML
Specialist cannot find the Amazon SageMaker notebook instance's EBS volume or Amazon EC2 instance
within the VPC.
Why can the ML Specialist not see the instance in the VPC?

26 / 30

A Mobile Network Operator is building an analytics platform to analyze and optimize a company's
operations using Amazon Athena and Amazon S3. The source systems send data in .CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it on Amazon S3. Which solution takes the LEAST effort to implement?

27 / 30

A Machine Learning Specialist is building a logistic regression model that will predict whether or not a person will order a pizza. The Specialist is trying to build the optimal model with an ideal classification threshold. What model evaluation technique should the Specialist use to understand how different classification thresholds will impact the model's performance?
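Evaluating how a classification threshold changes model behavior is exactly what a threshold sweep (the basis of an ROC curve) shows. A minimal sketch with hypothetical predicted probabilities and labels:

```python
def tpr_fpr(scores, labels, threshold):
    # Classify as positive when the score meets the threshold,
    # then compute true-positive and false-positive rates.
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Hypothetical predicted probabilities and true labels (1 = ordered pizza).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   1,   0,   0]

for t in (0.25, 0.5, 0.75):
    print(t, tpr_fpr(scores, labels, t))
```

Sweeping the threshold and plotting TPR against FPR traces the ROC curve; lower thresholds catch more positives at the cost of more false alarms.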

28 / 30

A financial services company is building a robust serverless data lake on Amazon S3. The data lake
should be flexible and meet the following requirements:
• Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum
• Support event-driven ETL pipelines
• Provide a quick and easy way to understand metadata
Which approach meets these requirements?

29 / 30

A Machine Learning Specialist is developing a daily ETL workflow containing multiple ETL jobs. The
workflow consists of the following processes:
• Start the workflow as soon as data is uploaded to Amazon S3.
• When all the datasets are available in Amazon S3, start an ETL job to join the uploaded datasets with
multiple terabyte-sized datasets already stored in Amazon S3.
• Store the results of joining datasets in Amazon S3.
• If one of the jobs fails, send a notification to the Administrator.
Which configuration will meet these requirements?

30 / 30

A Machine Learning Specialist must build out a process to query a dataset on Amazon S3 using Amazon
Athena. The dataset contains more than 800,000 records stored as plaintext CSV files. Each record
contains 200 columns and is approximately 1.5 MB in size. Most queries will span 5 to 10 columns only.
How should the Machine Learning Specialist transform the dataset to minimize query runtime?
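The intuition behind columnar formats here is that a query touching 10 of 200 columns only needs to scan those columns' data. A back-of-the-envelope sketch (assuming columns of roughly equal width and ignoring compression, both simplifications):

```python
# Back-of-the-envelope comparison of bytes scanned per query.
records = 800_000
columns = 200
bytes_per_record = 1.5 * 1024 * 1024   # ~1.5 MB per row as plaintext CSV
queried_columns = 10                    # "most queries span 5 to 10 columns"

row_scan = records * bytes_per_record              # CSV: whole rows are read
col_scan = row_scan * queried_columns / columns    # columnar: only 10 of 200

print(col_scan / row_scan)  # 0.05 -> roughly 95% less data scanned
```

Real columnar formats (e.g. Apache Parquet) typically do even better once compression and predicate pushdown are factored in.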
