Sequence modeling is the task of predicting which word or letter comes next. Unlike FNNs and CNNs, in sequence modeling the current output depends on the previous inputs, and the length of the input is not fixed.

Let’s look at the problem of auto-complete in the context of sequence modeling. In this problem, whenever we type a character (**d**), the system tries to predict the next possible character based on the previously typed characters.

In other words, the network tries to predict the next character from the 26 possible English letters, given that we have typed ‘**d**’. The neural network would have a softmax output of size 26 representing the probability of each next letter given the previous letters. Since the inputs to this network are characters, we need to convert them to one-hot encoded vectors of size 26, where the element corresponding to the index of the letter is set to 1 and everything else is set to 0.
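As a minimal sketch of the encoding and output described above (assuming NumPy and lowercase letters only; the score vector here is random, standing in for a trained network's output layer):

```python
import numpy as np

def one_hot(ch):
    """Encode a lowercase letter as a one-hot vector of size 26."""
    vec = np.zeros(26)
    vec[ord(ch) - ord('a')] = 1.0
    return vec

def softmax(z):
    """Convert raw scores into a probability distribution over 26 letters."""
    e = np.exp(z - z.max())
    return e / e.sum()

x = one_hot('d')              # index 3 is set to 1, all others 0
scores = np.random.randn(26)  # placeholder for the network's raw outputs
probs = softmax(scores)       # probs[i] = P(next letter is the i-th letter)
```

The softmax guarantees the 26 output values are non-negative and sum to 1, so they can be read directly as next-letter probabilities.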

Q.1 The main drawback of basic RNN is ** __**.

- Exploding gradient problem
- Vanishing gradient problem
- Overfitting
- Large training parameters

Correct Answer is – Vanishing gradient problem

Q.2 A feedforward network cannot be used on sequence data because **__**.

- You cannot pass hidden states to next input
- Dependencies on previous inputs will not be considered
- Of the varying length of input data
- All the options

Correct Answer is – All the options

Q.3 Which of the following are the applications of RNN?

- Time series prediction
- Machine translation
- Intent extraction
- All the options

Correct Answer is – All the options

Q.4 Error function in RNN is computed as **_____**.

- Summation of errors at individual time step
- Sum of errors at the first few time steps
- Weighted sum of error values at each time step
- Error value only at the final time step

Correct Answer is – Summation of errors at individual time step

Q.5 Back propagation through time is computed as **__**.

- Summation of derivatives of parameters on error function over the first few time steps
- Summation of derivatives of parameters on error function across each time step
- Product of derivatives of parameters on error function across each time step
- Derivatives of parameters on error function at the final time step

Correct Answer is – Summation of derivatives of parameters on error function across each time step
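The answers to Q.4 and Q.5 can be made concrete with a toy example (a hypothetical scalar recurrence h_t = w·h_{t-1} + x_t with squared-error loss, not any library's API): BPTT walks backward through time and accumulates each timestep's contribution to dL/dw.

```python
def bptt_grad(w, xs, ys):
    """Gradient of L = sum_t (h_t - y_t)^2 w.r.t. w for h_t = w*h_{t-1} + x_t."""
    hs = [0.0]
    for x in xs:                     # forward pass through time
        hs.append(w * hs[-1] + x)
    grad, dh = 0.0, 0.0
    for t in range(len(xs), 0, -1):  # backward pass through time
        dh += 2 * (hs[t] - ys[t - 1])  # local error derivative at step t
        grad += dh * hs[t - 1]         # this step's contribution to dL/dw
        dh *= w                        # carry the gradient back to step t-1
    return grad
```

The returned gradient is literally a summation over time steps, which is why very long sequences (where the carried factor is multiplied many times) lead to vanishing or exploding gradients.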

Q.6 Which of the following holds the memory in RNNs?

- Error function
- Hidden state
- Output state
- Input state

Correct Answer is – Hidden state

Q.7 Which of the following is not an example of sequential data?

- Time series data
- Text data
- Speech data
- Image data

Correct Answer is – Image data

Q.8 The basic RNN fails when **___________**.

- Input sequence is too small
- Input is of fixed length
- Input sequence is very long
- There are temporal dependencies in the data

Correct Answer is – Input sequence is very long

Q.9 In which of the following is the vanishing gradient problem most profound?

- GRU
- Basic RNN
- LSTM

Correct Answer is – Basic RNN

Q.10 In the case of LSTM, which of the following is used to compute the output of a cell?

- Hidden state
- Cell state
- Input state
- Reset state

Correct Answer is – Cell state

Q.11 Which of the following gates (in LSTM) decides on keeping relevant features from the current input?

- Input gate
- Forget gate
- Output gate
- Update gate

Correct Answer is – Input gate


Q.12 Which of the following is an additional feature of LSTM when compared to basic RNN?

- Hidden state
- Cell state
- Output state
- Input State

Correct Answer is – Cell state

Q.13 Which of the following is not a feature of LSTM?

- Hidden state
- Reset gate
- Update gate
- Cell state

Correct Answer is – Reset gate

Q.14 GRU has more parameters to learn when compared to LSTM.

- True
- False

Correct Answer is – False

Q.15 LSTM has a large number of parameters to learn when compared to basic RNN.

- True
- False

Correct Answer is – True

Q.16 Which of the following is not the feature of GRU?

- Hidden state
- Reset gate
- Update gate
- Cell state

Correct Answer is – Cell state

Q.17 The reason a feedforward network cannot be used on sequence data is **_**.

- Dependencies on previous inputs will not be considered
- Cannot pass hidden states to next input
- Varying length of input data
- All the options

Correct Answer is – All the options

Q.18 Which of the following is the component of GRU?

- Forget gate
- Reset gate
- Input gate
- Cell state

Correct Answer is – Reset gate

Q.19 In the case of LSTM, which of the following is used to compute the output of a cell?

- Cell state
- Hidden state
- Input state
- Reset state

Correct Answer is – Cell state

Q.20 Which of the following gates in LSTM decides on eliminating irrelevant features from previous information?

- Input Gate
- Update gate
- Forget gate
- Reset gate

Correct Answer is – Forget gate

Q.21 Which of the following gates in LSTM decides on keeping relevant features from the current input?

- Output gate
- Forget gate
- Update gate
- Input gate

Correct Answer is – Input gate

Q.22 Which of the following considers the information from previous timestep, current ones, as well as future timesteps?

- GRU
- LSTM
- RNN
- Bidirectional RNN

Correct Answer is – Bidirectional RNN

Q.23 What is meant by sequence data?

- The data with a temporal relationship between them
- The data with no relationship with other data
- The data with a spacial relationship between them
- Data arranged in a sequence

Correct Answer is – The data with a temporal relationship between them

Q.24 What is the entity that is carried from one timestep to next in RNN?

- Concatenated inputs from previous timesteps
- Final output
- Previous timestep input
- Hidden state

Correct Answer is – Hidden state
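The answer to Q.24 can be sketched in code: in a vanilla RNN, the hidden state is the only quantity carried from one timestep to the next (a hypothetical step with made-up sizes, assuming NumPy; not any library's API):

```python
import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 3))  # input-to-hidden weights (hypothetical sizes)
W_hh = rng.normal(size=(4, 4))  # hidden-to-hidden weights
b_h = np.zeros(4)

def rnn_step(x, h_prev):
    """One RNN timestep: the new hidden state mixes the current input
    with the hidden state carried over from the previous timestep."""
    return np.tanh(W_xh @ x + W_hh @ h_prev + b_h)

h = np.zeros(4)                    # initial hidden state
for x in rng.normal(size=(5, 3)):  # a sequence of 5 inputs
    h = rnn_step(x, h)             # h is the only thing carried forward
```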

Q.25 De-noising and Contractive are examples of *__*

- Shallow Neural Networks
- Autoencoders
- Convolution Neural Networks
- Recurrent Neural Networks

Correct Answer is – Autoencoders

Q.26 How do RNNs interpret words?

- One Hot Encoding
- Lower Case Versions
- Word Frequencies
- Vector Representations

Correct Answer is – Vector Representations

Q.27 Autoencoders are trained using **___**.

- Feed Forward
- Reconstruction
- Back Propagation
- They do not require Training

Correct Answer is – Back Propagation

Q.28 The rate at which cost changes with respect to weight or bias is called **______**.

- Derivative
- Gradient
- Loss
- Rate of Change

Correct Answer is – Gradient

Q.29 A **_____** matches or surpasses the output of an individual neuron to a visual stimulus.

- Max Pooling
- Gradient
- Cost
- Convolution

Correct Answer is – Convolution

Q.30 Autoencoders cannot be used for Dimensionality Reduction.

- True
- False

Correct Answer is – False

Q.1 What is the difference between the actual output and generated output known as?

- Output Modulus
- Accuracy
- Cost
- Output Difference

Correct Answer is – Cost

Q.2 Prediction Accuracy of a Neural Network depends on **___** and **___**.

- Input and Output
- Weight and Bias
- Linear and Logistic Function
- Activation and Threshold

Correct Answer is – Weight and Bias

Q.3 GPU stands for ** __**.

- Graphics Processing Unit
- Gradient Processing Unit
- General Processing Unit
- Good Processing Unit

Correct Answer is – Graphics Processing Unit

Q.4 Recurrent Neural Networks are best suited for Text Processing.

- True
- False

Correct Answer is – True

Q.5 Recurrent Networks work best for Speech Recognition.

- True
- False

Correct Answer is – True

Q.6 Gradient at a given layer is the product of all gradients at the previous layers.

- True
- False

Correct Answer is – True

Q.7 Neural Networks Algorithms are inspired from the structure and functioning of the Human Biological Neuron.

- True
- False

Correct Answer is – True

Q.8 In a Neural Network, all the edges and nodes have the same Weight and Bias values.

- True
- False

Correct Answer is – False

Q.9 **__** is a Neural Net’s way of classifying inputs.

- Learning
- Forward Propagation
- Activation
- Classification

Correct Answer is – Forward Propagation

Q.10 Name the component of a Neural Network where the true value of the input is not observed.

- Hidden Layer
- Gradient Descent
- Activation Function
- Output Layer

Correct Answer is – Hidden Layer

Q.11 **__** works best for Image Data.

- AutoEncoders
- Single Layer Perceptrons
- Convolution Networks
- Random Forest

Correct Answer is – Convolution Networks

Q.12 **_** is a recommended Model for Pattern Recognition in Unlabeled Data.

- CNN
- RNN
- Autoencoders
- Shallow Neural Networks

Correct Answer is – Autoencoders

Q.13 Process of improving the accuracy of a Neural Network is called **___**.

- Forward Propagation
- Cross Validation
- Random Walk
- Training

Correct Answer is – Training

Q.14 Data Collected from Survey results is an example of **__**.

- Data
- Information
- Structured Data
- Unstructured Data

Correct Answer is – Structured Data

Q.15 Support Vector Machines, Naive Bayes and Logistic Regression are used for solving **_** problems.

- Clustering
- Classification
- Regression
- Time Series

Correct Answer is – Classification

Q.16 The rate at which cost changes with respect to weight or bias is called **__**.

- Derivative
- Gradient
- Rate of Change
- Loss

Correct Answer is – Gradient

Q.17 What does LSTM stand for?

- Long Short Term Memory
- Least Squares Term Memory
- Least Square Time Mean
- Long Short Threshold Memory

Correct Answer is – Long Short Term Memory

Q.18 A Shallow Neural Network has only one hidden layer between Input and Output layers.

- True
- False

Correct Answer is – True

Q.19 All the Visible Layers in a Restricted Boltzmann Machine are connected to each other.

- True
- False

Correct Answer is – False

Q.20 All the neurons in a convolution layer have different Weights and Biases.

- True
- False

Correct Answer is – False

Q.21 Recurrent Network can input Sequence of Data Points and Produce a Sequence of Output.

- True
- False

Correct Answer is – True

Q.22 A Deep Belief Network is a stack of Restricted Boltzmann Machines.

- True
- False

Correct Answer is – True

Q.23 Restricted Boltzmann Machine expects the data to be labeled for Training.

- True
- False

Correct Answer is – False

Q.24 What is the method to overcome the Decay of Information through time in RNN known as?

- Back Propagation
- Gradient Descent
- Activation
- Gating

Correct Answer is – Gating
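The gating idea from Q.24 can be illustrated with a toy example (a hypothetical scalar, GRU-style update gate; not any library's API): the gate decides how much of the old state survives each step, so information decays slowly instead of being overwritten.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_update(h_prev, candidate, z):
    """GRU-style update: gate z controls the mix of old state and new candidate."""
    return z * h_prev + (1.0 - z) * candidate

h = 1.0
z = sigmoid(3.0)  # gate mostly open to the past (about 0.95)
for _ in range(100):
    h = gated_update(h, candidate=0.0, z=z)
# after 100 steps, the gated state retains far more of the original signal
# than an ungated state multiplied by a small factor each step would
```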

Q.25 What is the best Neural Network Model for Temporal Data?

- Recurrent Neural Network
- Convolution Neural Networks
- Temporal Neural Networks
- Multi Layer Perceptrons

Correct Answer is – Recurrent Neural Network

Q.26 RELU stands for **__**.

- Rectified Linear Unit
- Rectified Lagrangian Unit
- Regressive Linear Unit
- Regressive Lagrangian Unit

Correct Answer is – Rectified Linear Unit

Q.27 Why is the Pooling Layer used in a Convolution Neural Network?

- They are of no use in CNN
- Dimension Reduction
- Image Sensing
- Object Recognition

Correct Answer is – Dimension Reduction
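A minimal sketch of the dimension reduction named in Q.27 (2x2 max pooling with stride 2, assuming NumPy; real frameworks provide this as a layer):

```python
import numpy as np

def max_pool_2x2(img):
    """2x2 max pooling with stride 2: keeps the strongest activation in
    each 2x2 window, halving both spatial dimensions."""
    h, w = img.shape
    trimmed = img[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.array([[1, 2, 0, 1],
                [3, 4, 1, 0],
                [0, 1, 5, 6],
                [2, 1, 7, 8]], dtype=float)
pooled = max_pool_2x2(img)  # a 4x4 feature map shrinks to 2x2
```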

Q.28 What are the two layers of a Restricted Boltzmann Machine called?

- Input and Output Layers
- Recurrent and Convolution Layers
- Activation and Threshold Layers
- Hidden and Visible Layers

Correct Answer is – Hidden and Visible Layers

Q.29 The measure of difference between two probability distributions is known as **____**.

- Probability Difference
- Cost
- KL Divergence
- Error

Correct Answer is – KL Divergence
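A minimal sketch of the KL divergence named in Q.29 (assuming NumPy; valid only where both distributions are strictly positive):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i); zero iff P and Q are identical."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
d = kl_divergence(p, q)      # positive, since the distributions differ
same = kl_divergence(p, p)   # 0.0 for identical distributions
```

Note that KL divergence is not symmetric: KL(P || Q) generally differs from KL(Q || P), which is why it is called a divergence rather than a distance.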