TensorFlow Interview Questions and Answers

What is TensorFlow?

TensorFlow is an open-source end-to-end platform for machine learning. It provides a comprehensive ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

Who developed TensorFlow?

TensorFlow was developed by the Google Brain team.

What are the key features of TensorFlow?

**Flexible Architecture:** Allows deployment on various platforms (CPUs, GPUs, TPUs, mobile, web, cloud).
**Computation Graph:** Uses a dataflow graph to represent computations, which enables optimizations. (More prominent in TF1.x, but still underlies TF2.x).
**Automatic Differentiation:** Supports automatic differentiation for training neural networks efficiently.
**Scalability:** Designed for large-scale distributed training.
**TensorBoard:** A powerful visualization tool for monitoring training, debugging, and visualizing graphs.
**Keras API:** High-level API integrated into TensorFlow for rapid prototyping and model building.
**Eager Execution:** An imperative programming environment that evaluates operations immediately, making it easier to debug and iterate. (Default in TF2.x).
**SavedModel:** A universal serialization format for TensorFlow models.

What is the difference between TensorFlow 1.x and TensorFlow 2.x?

**Eager Execution:** TF2.x defaults to Eager Execution, while TF1.x defaulted to graph execution (requiring sessions).
**API Simplification:** TF2.x removed deprecated APIs and reduced redundancy.
**Keras Integration:** Keras is the official high-level API in TF2.x.
**AutoGraph:** In TF2.x, automatically converts Python control flow (`if`, `for`, `while`) inside a `tf.function` into TensorFlow graph operations.
**Distributed Training:** Simplified distributed training APIs in TF2.x.
**SavedModel:** The primary model export format in TF2.x.

What is a Tensor in TensorFlow?

A Tensor is the fundamental unit of data in TensorFlow. It's a multi-dimensional array (similar to NumPy arrays) that represents data flowing through the computation graph. Tensors have a data type (e.g., `float32`, `int32`) and a shape (the dimensions of the array).

What are the different ranks of Tensors? Give examples.

**Rank 0:** Scalar (e.g., `tf.constant(7)`)
**Rank 1:** Vector (e.g., `tf.constant([1, 2, 3])`)
**Rank 2:** Matrix (e.g., `tf.constant([[1, 2], [3, 4]])`)
**Rank 3:** Tensor with 3 dimensions (e.g., `tf.constant([[[1], [2]], [[3], [4]]])`)
...and so on for higher ranks.
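
As a quick illustration of the ranks above, the sketch below inspects a tensor's rank, shape, and dtype. The variable name `t` is only illustrative.

```python
import tensorflow as tf

# A rank-2 tensor (matrix) of 32-bit floats.
t = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

rank = int(tf.rank(t))      # number of dimensions
shape = t.shape.as_list()   # size of each dimension
dtype = t.dtype             # element data type

print(rank, shape, dtype)
```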

What are the different data types supported by Tensors?

TensorFlow supports various data types, including:
- Floating-point types (`tf.float16`, `tf.float32`, `tf.float64`)
- Integer types (`tf.int8`, `tf.int16`, `tf.int32`, `tf.int64`, `tf.uint8`, `tf.uint16`)
- Boolean type (`tf.bool`)
- String type (`tf.string`)
- Complex number types (`tf.complex64`, `tf.complex128`)

What is a Constant in TensorFlow?

A Constant is a Tensor whose value cannot be changed after it is created. Constants are created with `tf.constant()`.
```
import tensorflow as tf
x = tf.constant(10)
y = tf.constant([1, 2, 3])
```

What is a Variable in TensorFlow?

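A Variable is a special kind of Tensor that can be modified during computation (e.g., during model training to hold model weights and biases). In TF2.x, a Variable is initialized from the value passed to `tf.Variable(...)`; the separate initializer step required in TF1.x is no longer needed.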
```
import tensorflow as tf
w = tf.Variable(tf.random.normal([3, 2]))
b = tf.Variable(tf.zeros([2]))
```

What is the difference between a Constant and a Variable?

**Constant:** Value is fixed after creation. Used for data that doesn't change during computation.
**Variable:** Value can be changed/updated during computation. Used for model parameters (weights, biases) that are learned during training.
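
A minimal sketch of the difference, using illustrative names `c` and `v`: the Variable is updated in place, while "changing" the constant really means building a new tensor.

```python
import tensorflow as tf

c = tf.constant([1.0, 2.0])
v = tf.Variable([1.0, 2.0])

# A Variable can be updated in place.
v.assign([3.0, 4.0])
v.assign_add([1.0, 1.0])   # v now holds [4.0, 5.0]

# An operation on a constant returns a new tensor; c itself is unchanged.
c2 = c + 1.0

print(v.numpy(), c.numpy(), c2.numpy())
```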

What is Eager Execution in TensorFlow?

Eager Execution is an imperative programming environment in TensorFlow where operations are evaluated immediately when they are called. This makes development and debugging easier, as you can inspect the results of operations step by step, similar to NumPy. It is the default execution mode in TensorFlow 2.x.

What are the benefits of Eager Execution?

- Easier to learn and use.
- More intuitive debugging (can inspect intermediate tensor values).
- Closer integration with Python control flow.
- Faster development iteration.
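
To make these points concrete, here is a small sketch (values chosen only for illustration) in which an operation runs immediately and an ordinary Python `if` branches on the result.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.matmul(a, a)   # evaluated immediately, no session required

# Intermediate values can be inspected like NumPy arrays.
print(b.numpy())

# Plain Python control flow can depend on a tensor value.
if tf.reduce_sum(b) > 50:
    label = "large"
else:
    label = "small"
```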

What is a Computation Graph in TensorFlow?

A Computation Graph (or Dataflow Graph) is a way to represent a series of operations as nodes and the data (Tensors) flowing between them as edges. In TensorFlow 1.x, you had to explicitly build the graph and then run it within a Session. In TensorFlow 2.x, graphs are traced automatically from Python code when you use `tf.function`, with AutoGraph converting Python control flow into graph operations. Graphs allow whole-program optimizations and make distributed execution and model export possible.

What are the benefits of using Graphs (specifically with `tf.function`)?

**Performance:** Graphs can be optimized (e.g., kernel fusion, memory optimization).
**Portability:** Graphs can be saved and run without the original Python code (e.g., on mobile, in production).
**Distribution:** Graphs can be easily distributed across multiple devices or machines.
**Serialization:** Can be easily serialized and exported using SavedModel.

What is `tf.function`?

`tf.function` is a decorator in TensorFlow 2.x that compiles a Python function into a TensorFlow graph. This allows you to get the performance, portability, and scalability benefits of graphs while writing code in Python using Eager Execution.
```
@tf.function
def train_step(images, labels):
  # ... model training logic ...
  return loss
```

How does `tf.function` work (briefly)?

When a Python function decorated with `@tf.function` is called, TensorFlow traces the function's execution for the given input signature (tensor shapes and dtypes, plus the values of any plain Python arguments). This tracing builds a concrete function (a callable graph). Subsequent calls with compatible signatures reuse the traced graph; a call with a new signature triggers a new trace.
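
Tracing can be observed with a Python side effect, which runs only while the function is being traced. This is a sketch; `trace_log` and `double` are illustrative names.

```python
import tensorflow as tf

trace_log = []

@tf.function
def double(x):
    # Python side effect: executes only during tracing, not on graph reuse.
    trace_log.append(x.dtype.name)
    return x * 2

double(tf.constant([1, 2]))        # first call: traces for int32, shape (2,)
double(tf.constant([3, 4]))        # same dtype and shape: reuses the graph
double(tf.constant([1.0, 2.0]))    # new dtype: traces again

print(trace_log)
```

Two traces are recorded for three calls, because the second call matches the signature of the first.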

What is the difference between a `tf.Module` and a `tf.keras.Model`?

`tf.Module`: A base class for organizing TensorFlow code and variables. It allows you to group related variables and functions (`@tf.function`) together. It's a lower-level abstraction than `tf.keras.Model`.
`tf.keras.Model`: A higher-level class built on top of `tf.Module` specifically designed for building neural networks. It includes standard methods for training (`compile`, `fit`), evaluation (`evaluate`), prediction (`predict`), saving, and loading. It manages layers and their variables automatically.

What is Keras? Why is it used with TensorFlow?

Keras is a high-level API for building and training deep learning models. It's known for its user-friendliness, modularity, and ease of prototyping. Keras is the official high-level API recommended for TensorFlow 2.x, providing a simplified way to build complex models without dealing directly with low-level TensorFlow operations unless needed.

What is a Keras Layer? Give an example.

A Keras Layer is the fundamental building block of a Keras Model. It encapsulates a set of weights and biases and a computation (e.g., convolution, pooling, dense transformation).
```
import tensorflow as tf
from tensorflow import keras

dense_layer = keras.layers.Dense(units=64, activation='relu')
```

What are the two main ways to build Keras Models?

**Sequential API:** For building models where layers are stacked sequentially. Simple and easy for most common network architectures.

```
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])
```

**Functional API:** For building models with more complex or flexible architectures, including multi-input/multi-output models, shared layers, and models with non-sequential data flow.

```
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(128, activation='relu')(inputs)
x = keras.layers.Dropout(0.2)(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
```

When would you use the Keras Functional API over the Sequential API?

- When your model is not a simple stack of layers.
- When you need multiple inputs or multiple outputs.
- When you need to share layers between different parts of the model.
- When you need non-sequential connections between layers (e.g., skip connections in ResNet).

What is model compilation in Keras?

Model compilation is the step where you configure the learning process for your Keras model. You specify the optimizer, loss function, and metrics to be used during training.
```
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

What are the key components specified during Keras model compilation?

**Optimizer:** The algorithm used to update model weights during training (e.g., 'adam', 'sgd', 'rmsprop').
**Loss Function:** The function that measures the difference between the model's predictions and the true labels (e.g., 'categorical_crossentropy', 'mean_squared_error').
**Metrics:** Additional metrics used to evaluate the model's performance during training and evaluation (e.g., 'accuracy', 'precision').

What is the purpose of the `model.fit()` method in Keras?

`model.fit()` is used to train the Keras model. You provide the training data (features and labels), specify the number of epochs (iterations over the entire dataset), batch size, and potentially validation data.
```
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_val, y_val))
```

What is the purpose of the `model.evaluate()` method in Keras?

`model.evaluate()` is used to evaluate the model's performance on a given dataset (typically the test set) after training. It returns the loss and metric values specified during compilation.
```
loss, accuracy = model.evaluate(x_test, y_test)
```

What is the purpose of the `model.predict()` method in Keras?

`model.predict()` is used to generate predictions for new, unseen data using a trained model. It returns the model's output for the input data.
```
predictions = model.predict(new_data)
```

What is an Optimizer in TensorFlow? Give an example.

An Optimizer is an algorithm used during the training process to minimize the loss function by iteratively adjusting the model's parameters (weights and biases).
```
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
```

What is a Loss Function in TensorFlow? Give an example.

A Loss Function (or Objective Function) quantifies the difference between the model's predictions and the actual target values. The goal of training is to minimize this loss.
```
import tensorflow as tf
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
```

What is a Metric in TensorFlow? Give an example.

A Metric is a function used to evaluate the performance of the model. Unlike loss functions (which are used for optimization), metrics are used for reporting and interpretation.
```
import tensorflow as tf
metric = tf.keras.metrics.Accuracy()
```

What is Gradient Descent? How is it related to TensorFlow?

Gradient Descent is an iterative optimization algorithm used to find the minimum of a function (the loss function in machine learning). It works by taking steps proportional to the negative of the gradient of the function at the current point. TensorFlow provides various implementations of Gradient Descent and its variations (like Adam, RMSprop) as Optimizers, and it uses automatic differentiation to calculate the gradients efficiently.
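
As a rough sketch of the update rule, the loop below minimizes the toy loss `(w - 3)^2` by hand. The loss, the starting point, and the learning rate are assumptions chosen only for illustration.

```python
import tensorflow as tf

w = tf.Variable(0.0)
learning_rate = 0.1

for _ in range(100):
    with tf.GradientTape() as tape:
        loss = (w - 3.0) ** 2          # toy loss with its minimum at w = 3
    grad = tape.gradient(loss, w)      # d(loss)/dw = 2 * (w - 3)
    w.assign_sub(learning_rate * grad) # step against the gradient

print(w.numpy())
```

Built-in optimizers such as `tf.keras.optimizers.SGD` wrap this same update and add refinements like momentum.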

What is Automatic Differentiation in TensorFlow?

Automatic Differentiation (autodiff) is a technique used by TensorFlow to compute the gradients of computations. This is essential for training neural networks using gradient-based optimization algorithms like Gradient Descent. TensorFlow records the operations performed during the forward pass and then uses this information to compute the gradients efficiently during the backward pass.

What is `tf.GradientTape`?

`tf.GradientTape` is an API in TensorFlow used to record operations for automatic differentiation. By default, operations inside a `tf.GradientTape` context are recorded. After recording, you can use the tape to compute the gradients of a target (usually the loss) with respect to a source (usually the model's variables).
```
with tf.GradientTape() as tape:
  predictions = model(inputs)
  loss = loss_fn(labels, predictions)

gradients = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
```

What is TensorBoard? What is its purpose?

TensorBoard is a visualization toolkit for TensorFlow. It allows you to visualize various aspects of your machine learning workflow, including:
- Metrics (loss, accuracy) over time.
- Computation graphs.
- Model weights and biases distributions (histograms).
- Embeddings.
- Images.
- Profiling information.
It's crucial for monitoring training progress, debugging models, and understanding your data and model.

How do you integrate TensorBoard with Keras?

You can use the `tf.keras.callbacks.TensorBoard` callback during `model.fit()`. You specify a log directory where TensorBoard will write event files.
```
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="./logs")
model.fit(x_train, y_train, epochs=10, callbacks=[tensorboard_callback])
```

What is a Callback in Keras? Give an example.

A Callback is an object that can perform actions at various stages of the training process (e.g., at the beginning/end of an epoch, before/after a batch). Callbacks are passed to the `model.fit()` method.

Examples: `ModelCheckpoint`, `EarlyStopping`, `TensorBoard`, `ReduceLROnPlateau`.

```
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
model.fit(x_train, y_train, epochs=100, callbacks=[early_stopping])
```

What is the purpose of the `ModelCheckpoint` callback?

The `ModelCheckpoint` callback saves the model or model weights at certain points during training (e.g., after every epoch, or when a specific metric improves). This is useful for saving the best performing model or for resuming training later.

What is the purpose of the `EarlyStopping` callback?

The `EarlyStopping` callback stops training automatically when a monitored metric (e.g., validation loss) has stopped improving for a specified number of epochs (`patience`). This helps limit overfitting and saves training time.

What is overfitting? How can TensorFlow/Keras help mitigate it?

Overfitting occurs when a model learns the training data too well, including the noise, and performs poorly on unseen data.
TensorFlow/Keras provides several techniques to mitigate overfitting:
- **Regularization:** L1, L2 regularization (penalizing large weights).
- **Dropout:** Randomly setting a fraction of neuron outputs to zero during training.
- **Early Stopping:** Stopping training when validation performance degrades.
- **Data Augmentation:** Creating new training examples by applying transformations to existing data.
- **Batch Normalization:** Helps stabilize training and can act as a regularizer.

What is underfitting? How can TensorFlow/Keras help mitigate it?

Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and unseen data.
To mitigate underfitting:
- Use a more complex model (add more layers or neurons).
- Train for more epochs.
- Use a more appropriate model architecture for the problem.
- Improve data quality or add more relevant features.

What is Batch Normalization? What are its benefits?

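Batch Normalization is a technique that normalizes the outputs of a layer by subtracting the batch mean and dividing by the batch standard deviation, then applies a learned scale and shift. At inference time, it uses moving averages of the mean and variance collected during training instead of the current batch's statistics.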
Benefits:
- Helps stabilize training, allowing for higher learning rates.
- Reduces the sensitivity to the initial weights.
- Acts as a form of regularization, reducing the need for Dropout in some cases.
- Speeds up training.
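
The normalization step can be checked by hand. This sketch compares the Keras layer in training mode with a manual computation, assuming the layer's default initial scale of 1 and shift of 0; the input values are arbitrary.

```python
import tensorflow as tf

x = tf.constant([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
bn = tf.keras.layers.BatchNormalization()

# training=True: normalize with the statistics of this batch.
y_train = bn(x, training=True)

# Manual equivalent: (x - mean) / sqrt(variance + epsilon), per feature column.
mean, variance = tf.nn.moments(x, axes=[0])
manual = (x - mean) / tf.sqrt(variance + bn.epsilon)

col_means = tf.reduce_mean(y_train, axis=0)   # close to 0 for every column
print(col_means.numpy())
```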

What is Dropout? What is its purpose?

Dropout is a regularization technique where, during training, a randomly chosen fraction of a layer's outputs (set by the dropout rate) is set to zero on each step. This prevents neurons from becoming too co-dependent and forces the network to learn more robust features, reducing overfitting. Dropout is applied only during training; at inference the layer passes its inputs through unchanged. Keras scales the surviving outputs by 1/(1 - rate) during training so that expected activations stay the same.
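
The training/inference difference is easy to see on a tensor of ones. In this sketch the rate of 0.5 is an arbitrary choice; surviving entries are scaled by 1/(1 - rate), so they become 2.0.

```python
import tensorflow as tf

layer = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones([1, 8])

train_out = layer(x, training=True)    # some entries zeroed, the rest scaled to 2.0
infer_out = layer(x, training=False)   # inputs pass through unchanged

print(train_out.numpy())
print(infer_out.numpy())
```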

What is a Learning Rate? Why is it important?

The Learning Rate is a hyperparameter that controls the step size during the optimization process (e.g., Gradient Descent). It determines how much the model's weights are adjusted in response to the estimated error each time the weights are updated.
Importance:
- A high learning rate can cause the optimization to overshoot the minimum and oscillate.
- A low learning rate can make training very slow.
- Finding an appropriate learning rate is crucial for efficient and effective training.

What is a Learning Rate Schedule?

A Learning Rate Schedule is a technique where the learning rate is changed during training according to a predefined schedule or based on the training progress. Common schedules include decreasing the learning rate over time (e.g., step decay, exponential decay) or reducing it when a metric stops improving (`ReduceLROnPlateau` callback). This can help the model converge better, especially in the later stages of training.
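
For example, an exponential decay schedule can be passed directly to an optimizer. The initial rate, decay steps, and decay rate below are illustrative values, not recommendations.

```python
import tensorflow as tf

# Learning rate = 0.1 * 0.5 ** (step / 100)
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=100,
    decay_rate=0.5,
)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule)

lr_start = float(schedule(0))
lr_after_100 = float(schedule(100))
print(lr_start, lr_after_100)
```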

What is Transfer Learning? How can TensorFlow help?

Transfer Learning is a technique where a model trained on one task (e.g., image classification on a large dataset like ImageNet) is reused as a starting point for training a model on a related task. You typically use the pre-trained model's layers (especially the initial layers) and train only the later layers or add new layers for the new task.

TensorFlow/Keras provides access to many pre-trained models (e.g., VGG16, ResNet50, MobileNet) that can be easily loaded and used for transfer learning. You can load a pre-trained model, freeze some layers, and train the rest.

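```
import tensorflow as tf

num_classes = 10  # hypothetical number of target classes; set this for your task

# Load MobileNetV2 without its classification head, using ImageNet weights
base_model = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base_model.trainable = False  # Freeze the base model

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)  # Keep BatchNormalization layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes)(x)  # Outputs logits (no softmax)

model = tf.keras.Model(inputs, outputs)
```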

What is Fine-tuning in the context of Transfer Learning?

Fine-tuning is a step in transfer learning where, after training the new layers on top of a pre-trained model, you unfreeze some of the layers in the pre-trained base model (usually the later layers) and continue training the entire model (or parts of it) with a very low learning rate. This allows the model to adapt the pre-trained weights to the specific new task and dataset.
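
A minimal sketch of the unfreezing step follows. To stay self-contained it uses a small hypothetical stand-in for the pre-trained base (three `Dense` layers) rather than downloading real weights; the layer sizes and the learning rate of `1e-5` are assumptions.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base model.
base_model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
])

# Unfreeze the base, then re-freeze everything except its last layer.
base_model.trainable = True
for layer in base_model.layers[:-1]:
    layer.trainable = False

inputs = tf.keras.Input(shape=(16,))
x = base_model(inputs, training=False)
outputs = tf.keras.layers.Dense(3)(x)   # new task-specific head (logits)
model = tf.keras.Model(inputs, outputs)

# Recompile with a very low learning rate so pre-trained weights change slowly.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
print(len(model.trainable_weights))
```

Recompiling after changing `trainable` matters, because the set of trainable weights is fixed at compile time.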

What is a Custom Training Loop in TensorFlow? When would you use one?

A Custom Training Loop is when you write the training logic yourself using `tf.GradientTape` to compute gradients and an optimizer to apply them, instead of using the built-in `model.fit()` method.
You would use a custom training loop when:
- You need more control over the training process than `model.fit()` provides.
- You are implementing a complex model or training algorithm that isn't standard (e.g., GANs, reinforcement learning).
- You need fine-grained control over gradient computation or application.
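
The sketch below shows one possible shape of such a loop, fitting a single `Dense` layer to synthetic data for `y = 2x + 1`. The data, model size, learning rate, and epoch count are all assumptions made for illustration.

```python
import tensorflow as tf

# Synthetic, noise-free data for y = 2x + 1.
x = tf.random.uniform([256, 1], seed=0)
y = 2.0 * x + 1.0
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

model = tf.keras.Sequential([tf.keras.Input(shape=(1,)), tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

def train_step(features, targets):
    # Record the forward pass so gradients can be computed.
    with tf.GradientTape() as tape:
        predictions = model(features, training=True)
        loss = loss_fn(targets, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for epoch in range(100):
    for features, targets in dataset:
        loss = train_step(features, targets)

print(float(loss))
```

In practice `train_step` is often decorated with `@tf.function` for speed; it is left as plain Python here to keep the sketch simple.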

What is `tf.data`? What is its purpose?

`tf.data` is an API in TensorFlow used for building efficient and scalable data pipelines. It allows you to handle large datasets, perform transformations (mapping, batching, shuffling, etc.), and optimize data loading for training.
```
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)
```

What are the common operations you can perform with `tf.data` datasets?

`map()`: Apply a function to each element in the dataset.
`filter()`: Keep only elements that satisfy a condition.
`batch()`: Combine consecutive elements into batches.
`shuffle()`: Randomly shuffle the elements.
`repeat()`: Repeat the dataset multiple times.
`prefetch()`: Overlaps data preprocessing and model execution.
`cache()`: Caches elements in memory or a local file.
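
Several of these operations chained together, as a small sketch on a range dataset (the numbers are illustrative):

```python
import tensorflow as tf

dataset = (
    tf.data.Dataset.range(10)
    .filter(lambda x: x % 2 == 0)   # keep even values: 0, 2, 4, 6, 8
    .map(lambda x: x * x)           # square each one: 0, 4, 16, 36, 64
    .batch(2)                       # group into batches of two
    .prefetch(tf.data.AUTOTUNE)     # prepare upcoming batches in the background
)

batches = [batch.numpy().tolist() for batch in dataset]
print(batches)   # [[0, 4], [16, 36], [64]]
```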

What is the purpose of `dataset.prefetch(tf.data.AUTOTUNE)`?

`prefetch()` overlaps the data preprocessing and model execution steps. While the model is training on the current batch, the data pipeline is preparing the next batch in the background. `tf.data.AUTOTUNE` allows TensorFlow to automatically determine the optimal number of batches to prefetch based on available CPU and device resources, helping to keep the GPU/TPU busy and improving training performance.

What is `tf.estimator.Estimator`? (Note: Primarily for TF1.x, less used in TF2.x)

`tf.estimator.Estimator` was a high-level API introduced in TensorFlow 1.x for simplified training, evaluation, and distributed execution. It abstracted away much of the low-level detail of sessions and distributed setups. It remained available in TF2.x as `tf.estimator`, but was deprecated and then removed in TensorFlow 2.16. Keras and `tf.distribute.Strategy` are the recommended high-level APIs for distributed training in TF2.x.

What is `tf.distribute.Strategy`?

`tf.distribute.Strategy` is the primary API in TensorFlow 2.x for distributing training across multiple GPUs, multiple machines, or TPUs. It provides a unified way to distribute computation and variables.

Name some common `tf.distribute.Strategy` implementations.

`MirroredStrategy`: For single-host, multi-GPU synchronous training. Copies variables to each GPU and mirrors operations.
`MultiWorkerMirroredStrategy`: For multi-host, multi-GPU synchronous training.
`TPUStrategy`: For training on Tensor Processing Units (TPUs).
`OneDeviceStrategy`: For placing all variables and computation on a single device (e.g., a single GPU or CPU). Useful for testing distribution code locally.

How do you use `tf.distribute.Strategy` with Keras?

You create a distribution strategy instance and then create your Keras model *within* the strategy's scope. The strategy will automatically handle distributing the model and training across the specified devices/workers.
```
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
  model = build_keras_model() # Build your model here
  model.compile(...)

model.fit(...)
```

What is a TPU (Tensor Processing Unit)? How is it different from a GPU?

A TPU is an Application-Specific Integrated Circuit (ASIC) developed by Google specifically for accelerating machine learning workloads, particularly matrix multiplication operations which are common in neural networks.
Differences from GPUs:
- TPUs are designed for high-throughput, low-precision computation.
- TPUs excel at large matrix operations but are less flexible than GPUs for general-purpose computing.
- TPUs often require data to be processed in larger batches and have different memory access patterns than GPUs.
- TPUs are primarily used via cloud platforms (like Google Cloud TPU).

What is SavedModel? Why is it important?

SavedModel is the universal serialization format for TensorFlow models. It contains the complete TensorFlow program, including weights and computation graph(s).
Importance:
- Allows you to save and load models independent of the original code that created them.
- Enables deployment across various platforms (TensorFlow Serving, TensorFlow Lite, TensorFlow.js).
- Supports multiple signatures for different tasks (e.g., training, inference).

How do you save and load a Keras model using SavedModel?

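Save:
```
model.save('my_model')
```

Load:
```
loaded_model = tf.keras.models.load_model('my_model')
```

Note: this applies to TensorFlow releases that bundle Keras 2 (up to TF 2.15). With Keras 3 (the default from TF 2.16), `model.save()` expects a `.keras` or `.h5` file path, and a SavedModel is written with `model.export('my_model')` instead.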

What is TensorFlow Serving?

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. It can serve SavedModel models via gRPC or REST APIs. It supports model versioning and enables seamless model updates without downtime.

What is TensorFlow Lite?

TensorFlow Lite is a lightweight library for deploying TensorFlow models on edge devices (mobile, microcontrollers, IoT devices). It uses a specially optimized format (`.tflite`) and provides interpreters for various platforms, enabling on-device machine learning.

What is TensorFlow.js?

TensorFlow.js is a library for training and deploying machine learning models in JavaScript, directly within a web browser or in Node.js. It enables interactive ML experiences in the browser and can also run models on the server-side using Node.js.

What is the difference between `tf.constant(value)` and `tf.convert_to_tensor(value)`?

`tf.constant(value)`: Creates a constant tensor from the given `value`.
`tf.convert_to_tensor(value)`: Converts the given `value` (which can be a NumPy array, Python list, scalar, or another Tensor) into a Tensor. It's often used within functions where you need to ensure the input is a Tensor.
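
A typical use of `tf.convert_to_tensor` is at the top of a helper that should accept several input types. The helper name `scale` and its `factor` parameter are hypothetical.

```python
import numpy as np
import tensorflow as tf

def scale(values, factor=2.0):
    # Accept lists, NumPy arrays, or tensors and normalize them to a float32 Tensor.
    t = tf.convert_to_tensor(values, dtype=tf.float32)
    return t * factor

from_list = scale([1, 2, 3])
from_numpy = scale(np.array([1.0, 2.0, 3.0], dtype=np.float32))
from_tensor = scale(tf.constant([1.0, 2.0, 3.0]))

print(from_list.numpy(), from_numpy.numpy(), from_tensor.numpy())
```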

What is the purpose of `tf.cast()`?

`tf.cast()` is used to change the data type of a Tensor.

```
x = tf.constant([1, 2, 3], dtype=tf.int32)
y = tf.cast(x, dtype=tf.float32) # y is tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
```

What is the purpose of `tf.reshape()`?

`tf.reshape()` is used to change the shape of a Tensor without changing its data. The new shape must have the same total number of elements as the original tensor.
```
x = tf.constant([1, 2, 3, 4, 5, 6])
y = tf.reshape(x, [2, 3])
# y is tf.Tensor([[1 2 3], [4 5 6]], shape=(2, 3), dtype=int32)
```

What is the purpose of `tf.squeeze()`?

`tf.squeeze()` removes dimensions of size 1 from the shape of a Tensor.

```
x = tf.constant([[[1], [2], [3]]]) # shape=(1, 3, 1)
y = tf.squeeze(x) # shape=(3,)
# y is tf.Tensor([1 2 3], shape=(3,), dtype=int32)
```

What is the purpose of `tf.expand_dims()`?

`tf.expand_dims()` adds a dimension of size 1 to the shape of a Tensor at a specified axis.

```
x = tf.constant([1, 2, 3]) # shape=(3,)
y = tf.expand_dims(x, axis=0) # shape=(1, 3)
# y is tf.Tensor([[1 2 3]], shape=(1, 3), dtype=int32)
```

What is the purpose of `tf.transpose()`?

`tf.transpose()` permutes the dimensions of a Tensor according to a specified permutation. If no permutation is given, it reverses the dimensions.
```
x = tf.constant([[1, 2], [3, 4]]) # shape=(2, 2)
y = tf.transpose(x) # shape=(2, 2)
# y is tf.Tensor([[1 3], [2 4]], shape=(2, 2), dtype=int32)
```

What is the difference between `tf.matmul()` and `tf.multiply()`?

`tf.matmul(a, b)`: Performs matrix multiplication of tensors `a` and `b`.
`tf.multiply(a, b)`: Performs element-wise multiplication of tensors `a` and `b`.

```
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

matmul_result = tf.matmul(a, b) # [[19, 22], [43, 50]]
multiply_result = tf.multiply(a, b) # [[5, 12], [21, 32]]
```

What is Broadcasting in TensorFlow?

Broadcasting is a mechanism in TensorFlow (similar to NumPy) that allows element-wise operations between tensors with different shapes. Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension; two dimensions are compatible if they are equal or one of them is 1, and a missing leading dimension is treated as 1. The smaller tensor then behaves as if it were repeated along those dimensions to match the larger one, without the repeated copies needing to be stored explicitly.
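
A short sketch of both common cases, adding a row vector and a column vector to a matrix (values are illustrative):

```python
import tensorflow as tf

matrix = tf.constant([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = tf.constant([10, 20, 30])                # shape (3,)   -> treated as (1, 3)
col = tf.constant([[100], [200]])              # shape (2, 1)

plus_row = matrix + row   # row is applied to every row of the matrix
plus_col = matrix + col   # col is applied to every column of the matrix

print(plus_row.numpy())   # [[11 22 33] [14 25 36]]
print(plus_col.numpy())   # [[101 102 103] [204 205 206]]
```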

What is the purpose of `tf.reduce_sum()`?

`tf.reduce_sum()` computes the sum of elements across dimensions of a tensor. You can specify which dimensions to reduce.

```
x = tf.constant([[1, 2], [3, 4]])
sum_all = tf.reduce_sum(x) # 10
sum_axis0 = tf.reduce_sum(x, axis=0) # [4, 6] (column sums: reduces across rows)
sum_axis1 = tf.reduce_sum(x, axis=1) # [3, 7] (row sums: reduces across columns)
```

What is the purpose of `tf.argmax()` and `tf.argmin()`?

`tf.argmax()` returns the index with the largest value across a specified dimension of a tensor.

`tf.argmin()` returns the index with the smallest value across a specified dimension of a tensor.

```
x = tf.constant([[1, 5], [3, 2]])
argmax_cols = tf.argmax(x, axis=0) # [1, 0] (index 1 in col 0 has value 3, index 0 in col 1 has value 5)
argmax_rows = tf.argmax(x, axis=1) # [1, 0] (index 1 in row 0 has value 5, index 0 in row 1 has value 3)
```

What is the purpose of `tf.one_hot()`?

`tf.one_hot()` creates a one-hot encoding tensor from indices.

```
indices = [0, 2, 1]
depth = 3
one_hot_tensor = tf.one_hot(indices, depth)
# [[1., 0., 0.], [0., 0., 1.], [0., 1., 0.]]
```

What is the purpose of `tf.concat()`?

`tf.concat()` concatenates tensors along a specified dimension.

```
x = tf.constant([[1, 2], [3, 4]])
y = tf.constant([[5, 6]])
concat_rows = tf.concat([x, y], axis=0) # [[1, 2], [3, 4], [5, 6]]
concat_cols = tf.concat([x, tf.transpose(y)], axis=1) # [[1, 2, 5], [3, 4, 6]]
```

What is the purpose of `tf.stack()`?

`tf.stack()` stacks a list of rank-R tensors into a rank-(R+1) tensor along a new dimension.

```
x = tf.constant([1, 2])
y = tf.constant([3, 4])
stacked = tf.stack([x, y]) # [[1, 2], [3, 4]] (shape=(2, 2))
```

What is the difference between `tf.concat()` and `tf.stack()`?

`tf.concat()` joins tensors along an *existing* dimension. The rank of the resulting tensor is the same as the input tensors.
`tf.stack()` joins tensors along a *new* dimension. The rank of the resulting tensor is one greater than the input tensors.

What is a `tf.RaggedTensor`? When would you use it?

A `tf.RaggedTensor` is a tensor with one or more dimensions that have non-uniform lengths (i.e., "ragged" dimensions).
You would use it to represent sequences of varying lengths, such as text sentences, lists of features, or variable-sized batches of data, without padding. This can save memory and computation compared to padding to the maximum length.
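
For instance, three "sentences" of token IDs with different lengths can be stored without padding. The IDs here are made up for illustration.

```python
import tensorflow as tf

# Three sequences of different lengths.
sentences = tf.ragged.constant([[1, 2, 3], [4], [5, 6]])

lengths = sentences.row_lengths()               # length of each row
padded = sentences.to_tensor(default_value=0)   # convert to a dense, padded tensor if needed

print(lengths.numpy())   # [3 1 2]
print(padded.numpy())    # [[1 2 3] [4 0 0] [5 6 0]]
```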

What is a `tf.SparseTensor`? When would you use it?

A `tf.SparseTensor` is a tensor that efficiently represents data where most elements are zero. It stores only the non-zero values, their indices, and the dense shape of the tensor.
You would use it for data like:
- One-hot encoded vectors for categorical features with a large vocabulary.
- Text data represented as bag-of-words or TF-IDF vectors.
- Graph data represented as adjacency matrices.
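
A small sketch of the three components (indices, values, dense shape) and a conversion back to a dense tensor, with illustrative values:

```python
import tensorflow as tf

# A 3x4 tensor with only two non-zero entries.
sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2]],   # positions of the non-zero values
    values=[1, 2],              # the non-zero values themselves
    dense_shape=[3, 4],         # shape of the equivalent dense tensor
)

dense = tf.sparse.to_dense(sparse)
print(dense.numpy())
```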

What is the purpose of `tf.lookup`?

`tf.lookup` provides operations for looking up values in tables (like hash tables or initializable tables). This is useful for mapping strings (e.g., words in a vocabulary) to integer IDs, which is common in natural language processing.
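
As a sketch of the vocabulary use case, a static hash table can map words to IDs and return a default for unknown words. The vocabulary and the default of `-1` are illustrative choices.

```python
import tensorflow as tf

keys = tf.constant(["cat", "dog", "fish"])
values = tf.constant([0, 1, 2], dtype=tf.int64)

table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(keys, values),
    default_value=-1,   # returned for keys not in the table
)

ids = table.lookup(tf.constant(["dog", "bird", "cat"]))
print(ids.numpy())   # [ 1 -1  0]
```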

What is `tf.train.Checkpoint`?

`tf.train.Checkpoint` is used for saving and restoring the state of TensorFlow objects (like `tf.Variable`, `tf.Module`, `tf.keras.Model`, `tf.keras.optimizers.Optimizer`, and `tf.data.Iterator`). It tracks dependencies between these objects and saves/restores them correctly. It's the recommended way to save state for custom training loops.
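
A minimal save-and-restore round trip might look like the sketch below. The tracked names `step` and `weights` and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

import tensorflow as tf

step = tf.Variable(0, dtype=tf.int64)
weights = tf.Variable([1.0, 2.0])

# Track the objects whose state should be saved.
checkpoint = tf.train.Checkpoint(step=step, weights=weights)

ckpt_dir = tempfile.mkdtemp()
save_path = checkpoint.save(os.path.join(ckpt_dir, "ckpt"))

# Overwrite the values, then restore them from the checkpoint.
weights.assign([0.0, 0.0])
step.assign(99)
checkpoint.restore(save_path)

print(step.numpy(), weights.numpy())
```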

What is the difference between saving a model with `model.save()` (SavedModel) and `tf.train.Checkpoint`?

`model.save()` (SavedModel): Saves the entire model structure (graph) and weights. It's designed for export and serving. Can be loaded without the original code that built the model.
`tf.train.Checkpoint`: Saves the *state* of specified TensorFlow objects (variables, optimizer state, etc.). It's designed for saving and restoring training progress. Requires the original model code to load and restore the state correctly.

What is `tf.config`?

`tf.config` is an API used for configuring the TensorFlow runtime, such as listing available physical and logical devices (GPUs, CPUs), setting memory limits, or enabling/disabling certain optimizations.

```
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.set_visible_devices(gpus[0], 'GPU') # Use only the first GPU
    tf.config.experimental.set_memory_growth(gpus[0], True) # Allow memory growth
  except RuntimeError as e:
    print(e)
```

How do you manage GPU memory in TensorFlow?

By default, TensorFlow might allocate most or all of the GPU memory upfront. You can configure memory management using `tf.config`:
- `tf.config.experimental.set_memory_growth(device, True)`: Allows TensorFlow to allocate memory dynamically as needed, rather than reserving it all at once. Recommended to prevent out-of-memory errors when running multiple processes on the same GPU.
- `tf.config.set_logical_device_configuration(device, [tf.config.LogicalDeviceConfiguration(memory_limit=...)])`: Sets a specific memory limit for a logical device.

What are Logical Devices in TensorFlow?

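Logical devices are an abstraction over physical devices. By default, TensorFlow creates one logical device per physical device. You can also configure it to create multiple logical devices on a single physical device, for example to partition one GPU's memory or to simulate a multi-device machine when testing a `tf.distribute` strategy such as `MirroredStrategy`. The reverse is not supported: several physical devices are not merged into one logical device. Strategies like `MirroredStrategy` instead replicate the model across separate logical devices.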
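
The sketch below splits the first physical CPU into two logical devices. It assumes it runs before TensorFlow has initialized its devices; otherwise the call raises `RuntimeError`, which is caught here.

```python
import tensorflow as tf

cpus = tf.config.list_physical_devices('CPU')
try:
    # Ask for two logical devices backed by the same physical CPU.
    tf.config.set_logical_device_configuration(
        cpus[0],
        [tf.config.LogicalDeviceConfiguration(),
         tf.config.LogicalDeviceConfiguration()])
except RuntimeError as e:
    # Raised if the runtime was already initialized before this call.
    print(e)

logical_cpus = tf.config.list_logical_devices('CPU')
print(len(cpus), "physical CPU(s),", len(logical_cpus), "logical CPU(s)")
```

The same call with `memory_limit=...` on a GPU is how a single GPU is partitioned into several smaller logical GPUs.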

What is the purpose of `tf.debugging`?

`tf.debugging` provides APIs for debugging TensorFlow code, including:
- `tf.debugging.check_numerics()`: Checks for NaN or Inf values in tensors.
- Assertions (`tf.debugging.assert_equal`, `tf.debugging.assert_less`, etc.): Insert runtime checks into your graph.
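- Debugger V2: Enabled with `tf.debugging.experimental.enable_dump_debug_info()`, it writes execution and tensor information to a log directory that you inspect in TensorBoard's "Debugger V2" dashboard.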

What is the difference between Static Shape and Dynamic Shape of a Tensor?

**Static Shape:** The shape of a Tensor that is known *before* the graph is executed. This information is available at graph construction time through the `tensor.shape` property, and unknown dimensions appear as `None`. For example, when you define an input layer with a fixed `input_shape`.
**Dynamic Shape:** The shape of a Tensor that is only known *during* graph execution, obtained with `tf.shape(tensor)`. This often happens with variable-length inputs or operations whose output shape depends on the input data values.

How do you handle dynamic shapes in TensorFlow?

TensorFlow operations are designed to work with dynamic shapes. You can use operations like `tf.shape()`, `tf.size()`, `tf.rank()` to get shape information dynamically during execution. When defining Keras layers, you can sometimes use `None` in the `input_shape` to indicate a dynamic dimension (e.g., the batch size).
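
To make the distinction concrete, here is a small sketch (the function name and the recording list are illustrative) that records the static shape seen at trace time and returns the dynamic batch size at run time:

```python
import tensorflow as tf

static_shapes_seen = []

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
def batch_size_of(x):
    # Python code here runs at trace time: the batch dimension is still unknown.
    static_shapes_seen.append(x.shape.as_list())
    # tf.shape() is evaluated at run time and yields the actual size.
    return tf.shape(x)[0]

n = batch_size_of(tf.zeros([5, 3]))
```

Here the static shape recorded during tracing is `[None, 3]`, while `n` evaluates to the real batch size of the tensor that was passed in.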

What is the purpose of `tf.Module`?

`tf.Module` is a base class in TensorFlow for structuring your code and managing variables. It allows you to group related variables and `tf.function`-decorated methods together. It's primarily used when building custom models or components at a lower level than Keras, providing a way to make your custom logic stateful (by holding variables) and savable/restorable using `tf.train.Checkpoint`.
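
A minimal sketch of a custom module follows; the `Scale` class and its `factor` attribute are hypothetical names chosen for illustration:

```python
import tensorflow as tf

class Scale(tf.Module):
    """Multiplies its input by a learnable factor (hypothetical example)."""

    def __init__(self, factor, name=None):
        super().__init__(name=name)
        # Variables assigned as attributes are tracked automatically.
        self.factor = tf.Variable(factor, dtype=tf.float32, name="factor")

    @tf.function
    def __call__(self, x):
        return x * self.factor

scale = Scale(3.0)
out = scale(tf.constant([1.0, 2.0]))
num_vars = len(scale.trainable_variables)
```

Because the variable is tracked, an instance like this can be passed to `tf.train.Checkpoint` or `tf.saved_model.save()` without extra bookkeeping.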

What is the purpose of `tf.keras.Model.add_loss()`?

`tf.keras.Model.add_loss()` allows you to add arbitrary loss terms to your model that are not directly related to the output of the model's forward pass (e.g., regularization losses, losses from auxiliary tasks). These losses are added to the main loss during training.
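
As a sketch of the idea, the hypothetical layer below passes its input through unchanged and registers an activity penalty with `add_loss()`. The same method is available on layers and models; the class name and `rate` parameter are assumptions made for this example.

```python
import tensorflow as tf

class ActivityPenalty(tf.keras.layers.Layer):
    """Identity layer that adds rate * sum(x^2) as an extra loss term."""

    def __init__(self, rate=0.01, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, inputs):
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs

layer = ActivityPenalty(rate=0.5)
_ = layer(tf.ones((2, 2)))
# The registered terms are collected in `losses`; fit() adds them to the main loss.
penalty = float(layer.losses[0])
```

In a custom training loop you would add `sum(model.losses)` to the main loss yourself before computing gradients.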

What is the purpose of `tf.keras.Model.add_metric()`?

`tf.keras.Model.add_metric()` allows you to track arbitrary metrics during training and evaluation that are not automatically calculated by Keras (e.g., custom metrics, tracking the mean of activations).

What is the difference between a custom Layer and a custom Model in Keras?

**Custom Layer:** You subclass `tf.keras.layers.Layer` to define a new computational building block. You implement `build()` (to create weights) and `call()` (to define the forward pass computation). Layers are designed to be composable and used within models.
**Custom Model:** You subclass `tf.keras.Model` to create a complete model. Models are typically composed of layers (built-in or custom). You implement `call()` to define the forward pass of the entire model. Models have built-in training, evaluation, and saving methods (`compile`, `fit`, `evaluate`, `predict`, `save`).

How do you create a custom Keras Layer?

Subclass `tf.keras.layers.Layer`.
Implement the `build(self, input_shape)` method: This is where you create the layer's weights using `self.add_weight()`. It's called automatically the first time the layer is used, based on the input shape.
Implement the `call(self, inputs)` method: This is where you define the layer's forward pass computation using TensorFlow operations.

```
class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        # Forward standard arguments (name, dtype, ...) to the base class
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Called once, when the input shape is first known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        output = tf.matmul(inputs, self.w) + self.b
        if self.activation is not None:
            output = self.activation(output)
        return output
```

How do you create a custom Keras Model?

Subclass `tf.keras.Model`.
In the `__init__` method, define the layers or sub-modules that the model will use.

Implement the `call(self, inputs)` method: This defines the forward pass logic, connecting the layers defined in `__init__`.

```
class MyModel(tf.keras.Model):
    def __init__(self, num_classes):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(128, activation='relu')
        self.dropout = tf.keras.layers.Dropout(0.2)
        self.dense2 = tf.keras.layers.Dense(num_classes, activation='softmax')

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training) # Dropout needs training flag
        return self.dense2(x)
```

What is the purpose of the `training` argument in the `call` method of Keras Layers/Models?

The `training` argument is a boolean flag that indicates whether the layer/model is currently in training mode (`True`) or inference mode (`False`). This is necessary for layers that behave differently during training and inference, such as `Dropout` and `BatchNormalization`.
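
The effect is easy to see with `Dropout`. In this small sketch (the tensor size is arbitrary), the layer is the identity in inference mode, while in training mode it zeroes some units and rescales the rest by `1 / (1 - rate)`:

```python
import numpy as np
import tensorflow as tf

dropout = tf.keras.layers.Dropout(0.5)
x = tf.ones((1, 1000))

# Inference mode: the input passes through unchanged.
inference_out = np.asarray(dropout(x, training=False))
# Training mode: entries are either dropped (0.0) or scaled up (2.0 for rate=0.5).
training_out = np.asarray(dropout(x, training=True))
```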

What is the recommended way to save and load custom Keras objects (Layers, Models)?

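Custom layers and models that follow the standard Keras patterns (implementing `__init__`, `build`, `call`) can generally be saved with `model.save()` and loaded with `tf.keras.models.load_model()`. If a custom class takes constructor arguments, implement `get_config()` (and `from_config()` when the default reconstruction is not enough) so that it can be serialized. When loading, Keras also has to find the class: either register it with `@tf.keras.utils.register_keras_serializable()` or pass it through the `custom_objects` argument of `load_model()`.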
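
A minimal sketch of a serializable custom layer is shown below. The `Scale` class, its `factor` argument, and the `"demo"` package name are hypothetical; the round trip through `get_config()` and `from_config()` is what Keras relies on when it rebuilds the layer from a saved model.

```python
import tensorflow as tf

@tf.keras.utils.register_keras_serializable(package="demo")
class Scale(tf.keras.layers.Layer):
    def __init__(self, factor=1.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add our own argument.
        config = super().get_config()
        config.update({"factor": self.factor})
        return config

layer = Scale(factor=3.0)
clone = Scale.from_config(layer.get_config())
```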

What is the purpose of `tf.summary`?

`tf.summary` provides APIs for writing summary data (scalars, images, histograms, etc.) that can be visualized in TensorBoard. It's often used within custom training loops to log metrics and other information. The Keras `TensorBoard` callback uses `tf.summary` internally.

How do you record summaries in a custom training loop?

Use `tf.summary.create_file_writer()` to create a summary writer, specifying the log directory.
Use `with writer.as_default():` to set the default writer.

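Call summary operations (e.g., `tf.summary.scalar()`, `tf.summary.image()`) while the writer is the default; this works both eagerly and inside a `tf.function`. Specify a `step` argument (e.g., the training step number) to plot data over time.

```
import tensorflow as tf

log_dir = "logs/custom_train"
writer = tf.summary.create_file_writer(log_dir)

@tf.function
def train_step(model, optimizer, loss_fn, x, y):
  with tf.GradientTape() as tape:
    loss = loss_fn(y, model(x, training=True))

  with writer.as_default():
    # optimizer.iterations counts the updates applied so far
    tf.summary.scalar('train_loss', loss, step=optimizer.iterations)

  grads = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(grads, model.trainable_variables))
  return loss
```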

What is the purpose of `tf.lookup.StaticVocabularyTable`?

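`tf.lookup.StaticVocabularyTable` is a lookup table that maps strings (e.g., words) to integer IDs based on a predefined vocabulary, supplied through an initializer such as `tf.lookup.TextFileInitializer` or `tf.lookup.KeyValueTensorInitializer`. It's static because the vocabulary does not change after creation. Keys that are not in the vocabulary are hashed into one of `num_oov_buckets` out-of-vocabulary IDs placed after the vocabulary IDs. It's commonly used in NLP for tokenization and mapping tokens to embedding indices.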
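
As a small sketch, assuming a two-word in-memory vocabulary instead of a file, the table below maps known words to their IDs and sends any unknown word to the single out-of-vocabulary bucket:

```python
import tensorflow as tf

vocab = tf.constant(["hello", "world"])
ids = tf.constant([0, 1], dtype=tf.int64)  # IDs must be int64

init = tf.lookup.KeyValueTensorInitializer(keys=vocab, values=ids)
table = tf.lookup.StaticVocabularyTable(init, num_oov_buckets=1)

# "unknown" is not in the vocabulary, so it lands in the only OOV bucket (ID 2).
result = table.lookup(tf.constant(["world", "unknown"]))
```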

What is the purpose of `tf.strings`?

`tf.strings` provides operations for manipulating string tensors, such as splitting strings, joining strings, converting strings to numbers, and performing string length checks. It's useful for preprocessing text data within TensorFlow.
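
A few of these operations in a short sketch (the sample sentence is arbitrary):

```python
import tensorflow as tf

sentence = tf.constant("TensorFlow is great")

words = tf.strings.split(sentence)           # split on whitespace
lower = tf.strings.lower(words)              # lowercase each token
lengths = tf.strings.length(words)           # length of each token in bytes
joined = tf.strings.reduce_join(lower, separator="_")
number = tf.strings.to_number(tf.constant("3.5"))  # string -> float32
```

String tensors hold bytes, so values come back as Python `bytes` objects when converted with `.numpy()`.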

What is the purpose of `tf.io`?

`tf.io` provides operations for working with various data formats and filesystems, including:
- Reading and writing TFRecord files (a standard format for storing data in TensorFlow).
- Decoding images (JPEG, PNG).
- Working with Google Cloud Storage (GCS) or other filesystems.

What is a TFRecord file? Why use it?

A TFRecord file is a simple record-oriented binary format for storing sequences of binary records. In TensorFlow, these records are typically serialized `tf.Example` protocol buffers, which can store various types of data (features, labels, images, etc.).
Benefits:
- Efficient for storing large datasets.
- Can be read sequentially, which is efficient for streaming data during training.
- Supports various data types within a single file.
- Optimized for use with `tf.data`.

What is a `tf.Example`?

A `tf.Example` (exposed in Python as `tf.train.Example`) is a standard protocol buffer message used within TFRecord files to represent a single training or inference example. It contains a dictionary of features, where each feature holds a list of bytes, floats, or 64-bit integers (`BytesList`, `FloatList`, `Int64List`).
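
The following sketch writes three examples to a TFRecord file and reads the labels back. The feature names mirror the ones used in this document; the file location and the placeholder bytes are assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf

def make_example(image_bytes, label):
    """Builds a tf.train.Example with one bytes feature and one int64 feature."""
    return tf.train.Example(features=tf.train.Features(feature={
        "image_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    }))

path = os.path.join(tempfile.mkdtemp(), "demo.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for i in range(3):
        writer.write(make_example(b"placeholder-bytes", i).SerializeToString())

feature_description = {
    "image_raw": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
labels = [
    int(tf.io.parse_single_example(record, feature_description)["label"])
    for record in tf.data.TFRecordDataset(path)
]
```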

How do you read data from TFRecord files using `tf.data`?

Use `tf.data.TFRecordDataset()` to create a dataset that reads from one or more TFRecord files.

Use `dataset.map()` with `tf.io.parse_single_example()` to parse the serialized `tf.Example` protocol buffers into tensors. You need to define the feature description (mapping feature names to data types and shapes).

```
feature_description = {
    'image_raw': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}

def _parse_example_function(example_proto):
  return tf.io.parse_single_example(example_proto, feature_description)

dataset = tf.data.TFRecordDataset(['my_data.tfrecord'])
dataset = dataset.map(_parse_example_function)
```

What is the purpose of `tf.keras.preprocessing`? (Note: Largely superseded by Keras preprocessing layers in TF2.x)

`tf.keras.preprocessing` is a legacy module providing utility functions for data preprocessing, such as generating image batches from directories (`ImageDataGenerator`) or tokenizing text (`Tokenizer`). In TF2.x, the recommended approach is to use the Keras preprocessing layers (available directly under `tf.keras.layers`, for example `tf.keras.layers.TextVectorization` and `tf.keras.layers.Normalization`) or `tf.data` transformations for building end-to-end pipelines.

What are Keras Preprocessing Layers? Give an example.

Keras Preprocessing Layers are a set of layers in `tf.keras.layers` (earlier TF2.x releases exposed them under `tf.keras.layers.experimental.preprocessing`) that perform data preprocessing operations (e.g., normalization, text vectorization, image augmentation) directly within the Keras model or as a separate preprocessing step. They can be exported as part of the SavedModel.

```
# Text preprocessing
vectorize_layer = tf.keras.layers.TextVectorization(max_tokens=10000, output_sequence_length=250)
text_dataset = tf.data.Dataset.from_tensor_slices(["hello world", "tensorflow is great"])
vectorize_layer.adapt(text_dataset.batch(64)) # Learn the vocabulary

# Image preprocessing (image_dataset is assumed to yield (image, label) pairs)
normalization_layer = tf.keras.layers.Normalization(axis=-1)
normalization_layer.adapt(image_dataset.map(lambda x, y: x).batch(64)) # Learn mean and variance
```

What is the purpose of `layer.adapt()`?

The `adapt()` method in Keras preprocessing layers is used to fit the state of the preprocessing layer to the data. For example, a `TextVectorization` layer uses `adapt()` to build its vocabulary from the training data. A `Normalization` layer uses `adapt()` to compute the mean and variance of the data.
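
For instance, this sketch adapts a `Normalization` layer to a tiny made-up array and checks that the output is standardized:

```python
import numpy as np
import tensorflow as tf

data = np.array([[1.0], [2.0], [3.0]], dtype="float32")

norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(data)  # computes and stores the mean and variance of `data`

normalized = np.asarray(norm(data))
out_mean = float(normalized.mean())
out_std = float(normalized.std())
```

After `adapt()`, the layer's statistics are part of its state, so they are saved and restored together with the model.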

How do you perform data augmentation in TensorFlow/Keras?

Using Keras Preprocessing Layers: `tf.keras.layers.RandomFlip`, `tf.keras.layers.RandomRotation`, etc., added to the model or data pipeline.
Using `tf.data` transformations: Writing custom functions using TensorFlow image operations (`tf.image`) and applying them with `dataset.map()`.
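
The second approach can be sketched as follows. The random images stand in for a real dataset, and the particular augmentations are just examples:

```python
import tensorflow as tf

def augment(image, label):
    """Applies random flips and brightness changes to one float image in [0, 1]."""
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.clip_by_value(image, 0.0, 1.0)  # keep pixel values in range
    return image, label

# Placeholder data: 8 random 32x32 RGB images with dummy labels.
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.zeros((8,), dtype=tf.int32)

dataset = (tf.data.Dataset.from_tensor_slices((images, labels))
           .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(4))
batch_images, batch_labels = next(iter(dataset))
```

Augmentation should be applied only to the training pipeline, not to validation or test data.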

What is the purpose of `tf.keras.mixed_precision`?

`tf.keras.mixed_precision` is an API that enables mixed precision training, which uses both 16-bit (`float16` or `bfloat16`) and `float32` data types during training. This can significantly speed up training and reduce memory use on GPUs and TPUs by leveraging hardware optimized for 16-bit math (`float16` on recent NVIDIA GPUs, `bfloat16` on TPUs). Numerical stability is maintained by keeping variables in `float32` and, for `float16`, by applying loss scaling so that small gradients do not underflow.

How do you enable mixed precision training in Keras?

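Use `tf.keras.mixed_precision.set_global_policy('mixed_float16')` at the beginning of your script. Layers created afterwards compute in `float16` while keeping their variables in `float32`, and the necessary casts are inserted automatically. It is usually advisable to keep the model's final output (for example the softmax) in `float32`. `model.fit()` applies loss scaling automatically; in a custom training loop, wrap the optimizer in `tf.keras.mixed_precision.LossScaleOptimizer`.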
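
The sketch below sets the policy, inspects the dtypes a new layer picks up, and then restores the default policy so that later code is unaffected (the layer size is arbitrary):

```python
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy("mixed_float16")

layer = tf.keras.layers.Dense(4)
compute_dtype = layer.compute_dtype    # dtype used for the layer's computations
variable_dtype = layer.variable_dtype  # dtype used to store the weights

# Restore the default policy.
tf.keras.mixed_precision.set_global_policy("float32")
```

On hardware without fast `float16` support the policy still works, but it is unlikely to provide a speedup.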

What is the purpose of `tf.lookup.StaticHashTable`?

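`tf.lookup.StaticHashTable` is a lookup table that maps keys to values, populated once from an initializer such as `tf.lookup.KeyValueTensorInitializer` or `tf.lookup.TextFileInitializer`. It's static because the table contents do not change after creation. Keys that are not found return the `default_value` supplied at construction. It is typically used to map string or integer keys to values such as IDs or labels.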
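
A minimal sketch with made-up keys and values:

```python
import tensorflow as tf

init = tf.lookup.KeyValueTensorInitializer(
    keys=tf.constant(["cat", "dog"]),
    values=tf.constant([0, 1], dtype=tf.int64))

# Keys that are missing from the table map to default_value.
table = tf.lookup.StaticHashTable(init, default_value=-1)
result = table.lookup(tf.constant(["dog", "bird"]))
```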

What is the difference between `tf.gather()` and `tf.IndexedSlices`?

`tf.gather(params, indices)`: Gathers slices from `params` at the specified `indices` and returns them as a dense tensor, in the order given by `indices`.
`tf.IndexedSlices`: A sparse representation used internally by TensorFlow, particularly when computing gradients for operations that involve gathering slices (like embedding lookups). It represents a sparse tensor by storing `values`, `indices`, and `dense_shape`. It's not a standard tensor type you typically create directly, but you might encounter it when working with gradients.
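
The two meet when you differentiate through `tf.gather`. In this sketch (shapes chosen arbitrarily), the gradient with respect to the gathered variable comes back as `tf.IndexedSlices` covering only the rows that were read:

```python
import tensorflow as tf

params = tf.Variable(tf.ones((5, 2)))

with tf.GradientTape() as tape:
    rows = tf.gather(params, [0, 3])  # dense tensor of shape (2, 2)
    loss = tf.reduce_sum(rows)

grad = tape.gradient(loss, params)
is_sparse = isinstance(grad, tf.IndexedSlices)
touched_rows = grad.indices.numpy().tolist()

# Convert to a dense tensor when a regular Tensor is needed.
dense_grad = tf.convert_to_tensor(grad)
```

Optimizers accept `IndexedSlices` gradients directly, which is what keeps updates to large embedding tables cheap.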

What is the purpose of `tf.nn`?

`tf.nn` is a module in TensorFlow that contains common neural network operations, such as activation functions (`relu`, `sigmoid`, `softmax`), convolution operations (`conv2d`), pooling operations (`max_pool`), batch normalization, and loss functions. Keras layers often use these operations internally.

What are activation functions? Give examples from `tf.nn`.

Activation functions introduce non-linearity into neural networks, allowing them to learn complex patterns. They are applied to the output of layers.
Examples from `tf.nn`: `tf.nn.relu`, `tf.nn.sigmoid`, `tf.nn.softmax`, `tf.nn.tanh`.

What is the purpose of `tf.image`?

`tf.image` provides operations for image processing, such as resizing, cropping, flipping, adjusting brightness/contrast, and encoding/decoding image formats. It's useful for image preprocessing and augmentation within TensorFlow data pipelines.

What is the purpose of `tf.linalg`?

`tf.linalg` provides operations for linear algebra, such as matrix inversion, determinant calculation, eigenvalue decomposition, and solving linear systems.

What is the purpose of `tf.signal`?

`tf.signal` provides operations for signal processing, such as Fourier transforms (FFT and short-time Fourier transform), framing, window functions, and mel-spectrogram and MFCC helpers. Useful for working with audio or time-series data.
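
As an example, this sketch computes a spectrogram of a synthetic tone with `tf.signal.stft`. The sample rate, tone frequency, and frame sizes are arbitrary choices for illustration.

```python
import math

import tensorflow as tf

# One 440 Hz sine tone, 1024 samples at an assumed 16 kHz sample rate.
t = tf.range(1024, dtype=tf.float32) / 16000.0
signal = tf.sin(2.0 * math.pi * 440.0 * t)

# Short-time Fourier transform with 256-sample frames and a hop of 128 samples.
stft = tf.signal.stft(signal, frame_length=256, frame_step=128)
spectrogram = tf.abs(stft)  # magnitude, shape (frames, frequency bins)
```

With these settings the result has 7 frames and 129 frequency bins (`fft_length // 2 + 1`, where `fft_length` defaults to 256 here).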

What is the purpose of `tf.random`?

`tf.random` provides operations for generating random numbers and shuffling tensors. It's used for initializing weights, shuffling datasets, and implementing dropout.
```
input_tensor = tf.constant([[1, 2], [3, 4], [5, 6]])
random_tensor = tf.random.normal(shape=(2, 3), mean=0.0, stddev=1.0)
shuffled_tensor = tf.random.shuffle(input_tensor) # Shuffles along the first dimension
```

What is the importance of seeding random operations?

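Seeding random operations (using `tf.random.set_seed()`) ensures reproducibility. If you run your code multiple times with the same seed, the random numbers generated will be the same, leading to the same initial weights and data shuffling, which helps in debugging and comparing experiments. Randomness that comes from Python or NumPy needs its own seed; `tf.keras.utils.set_random_seed()` sets the Python, NumPy, and TensorFlow seeds in one call. Some GPU kernels are also non-deterministic, which `tf.config.experimental.enable_op_determinism()` addresses at some cost in speed.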
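
A quick sketch of the effect in eager mode: resetting the global seed restarts the random sequence, so the two draws below match.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(42)
first = tf.random.uniform((3,)).numpy()

tf.random.set_seed(42)  # resetting the seed restarts the sequence
second = tf.random.uniform((3,)).numpy()

same = bool(np.allclose(first, second))
```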

What is the purpose of `tf.debugging.check_numerics()`?

`tf.debugging.check_numerics(tensor, message)` is used to check a tensor for `NaN` (Not a Number) or `Inf` (Infinity) values during execution. If it finds any, it raises an error with the specified message. This is invaluable for debugging training issues where the loss or gradients explode.
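
For example, in this sketch (the `safe_log` helper is hypothetical) the check passes for positive inputs and raises for an input whose logarithm is `-inf`:

```python
import tensorflow as tf

def safe_log(x):
    """Returns log(x), raising if the result contains NaN or Inf."""
    y = tf.math.log(x)
    return tf.debugging.check_numerics(y, message="log produced NaN/Inf")

ok = safe_log(tf.constant([1.0, 2.0]))

try:
    safe_log(tf.constant([1.0, 0.0]))  # log(0) is -inf
    caught = False
except tf.errors.InvalidArgumentError:
    caught = True
```

To check every op without editing the model, `tf.debugging.enable_check_numerics()` can be called once at program start, at some cost in speed.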

What are Assertions in `tf.debugging`?

Assertions in `tf.debugging` (e.g., `tf.debugging.assert_equal`, `tf.debugging.assert_rank`) are operations that insert runtime checks into your graph. If an assertion fails during execution, it raises an error. They are useful for verifying tensor properties (like shape or values) during development or in `tf.function`s.
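
A short eager-mode sketch, using an arbitrary tensor, in which the first two assertions pass and the third fails:

```python
import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# These checks pass silently.
tf.debugging.assert_rank(x, 2, message="expected a [batch, features] tensor")
tf.debugging.assert_non_negative(x)

# This one fails: x has 2 features, not 3.
try:
    tf.debugging.assert_equal(tf.shape(x)[1], tf.constant(3),
                              message="expected 3 features")
    failed = False
except tf.errors.InvalidArgumentError:
    failed = True
```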

What is the TensorFlow Profiler?

The TensorFlow Profiler is a tool used to measure and analyze the performance of your TensorFlow code on various devices (CPU, GPU, TPU). It helps identify bottlenecks in your input pipeline, model execution, and device utilization, allowing you to optimize performance. You can access it via TensorBoard.

How do you use the TensorFlow Profiler?

Enable profiling in your code (e.g., using `tf.profiler.experimental.start()` and `stop()`, or the Keras `TensorBoard` callback with `profile_batch`).
Run your code.
Open TensorBoard (with the `tensorboard-plugin-profile` package installed) and navigate to the "Profile" tab.
Analyze the reports (step time, device utilization, input pipeline analysis, etc.).

What is the purpose of `tf.lookup.TextFileInitializer`?

`tf.lookup.TextFileInitializer` is used to initialize a lookup table from a text file where each line contains a key and optionally a value. It's commonly used to initialize vocabulary tables from a file containing words.
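
As a sketch, the code below writes a three-word vocabulary file (a temporary file created only for this example) and builds a table that maps each line to its zero-based line number:

```python
import os
import tempfile

import tensorflow as tf

vocab_path = os.path.join(tempfile.mkdtemp(), "vocab.txt")
with open(vocab_path, "w") as f:
    f.write("the\ncat\nsat\n")

init = tf.lookup.TextFileInitializer(
    vocab_path,
    key_dtype=tf.string,
    key_index=tf.lookup.TextFileIndex.WHOLE_LINE,     # key: the whole line
    value_dtype=tf.int64,
    value_index=tf.lookup.TextFileIndex.LINE_NUMBER)  # value: line number

table = tf.lookup.StaticHashTable(init, default_value=-1)
result = table.lookup(tf.constant(["cat", "dog"]))
```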

What is the difference between `tf.keras.layers.RNN` and `tf.keras.layers.LSTM`/`tf.keras.layers.GRU`?

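`tf.keras.layers.RNN`: A base class for recurrent layers. It wraps an RNN cell (a built-in one such as `tf.keras.layers.LSTMCell`, or a user-defined cell that implements `call(inputs, states)` and exposes a `state_size`) and applies it across the timesteps of a sequence.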
`tf.keras.layers.LSTM`: A concrete implementation of a Long Short-Term Memory (LSTM) layer, which is a type of RNN layer with internal gates designed to mitigate the vanishing gradient problem.
`tf.keras.layers.GRU`: A concrete implementation of a Gated Recurrent Unit (GRU) layer, another type of RNN layer that is simpler than LSTM but also effective at handling sequences.
You typically use `LSTM` or `GRU` directly rather than building a layer with the base `RNN` class unless implementing a custom RNN cell.

What is the purpose of `return_sequences=True` in RNN layers?

When `return_sequences=True`, the RNN layer returns the output for each timestep in the sequence. The output shape will typically be `(batch_size, timesteps, units)`. This is needed when stacking multiple RNN layers or when using the output sequence for tasks like sequence-to-sequence modeling.
When `return_sequences=False` (default), the RNN layer returns only the output from the last timestep. The output shape will typically be `(batch_size, units)`. This is common when the RNN is followed by dense layers for classification or regression.
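
The difference in output shape can be seen in a short sketch (batch size, sequence length, and layer width are arbitrary):

```python
import tensorflow as tf

x = tf.random.normal((2, 5, 3))  # (batch_size, timesteps, features)

all_steps = tf.keras.layers.LSTM(4, return_sequences=True)(x)
last_step = tf.keras.layers.LSTM(4)(x)

all_steps_shape = tuple(all_steps.shape)  # one output per timestep
last_step_shape = tuple(last_step.shape)  # only the final output
```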

What is an Embedding Layer (`tf.keras.layers.Embedding`)? When would you use it?

An Embedding Layer is used to convert sparse representations (like integer IDs representing words or categories) into dense, fixed-size vectors (embeddings). It's essentially a lookup table where each ID is associated with a learnable vector.
You would use it for:
- Natural Language Processing (NLP): Mapping word IDs to word embeddings.
- Categorical Features: Mapping categorical feature IDs to embeddings.

What is the purpose of `mask_zero=True` in an Embedding Layer?

When `mask_zero=True`, the Embedding layer will treat the index 0 as a special "padding" value and mask out timesteps with this index. This is useful when working with variable-length sequences that have been padded with zeros to a uniform length. Masking ensures that padding values do not contribute to the computation in subsequent layers (like RNNs or attention layers). Because index 0 is then reserved, it cannot be used for a real token, so `input_dim` should be the vocabulary size plus one.
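
The generated mask can be inspected directly. In this sketch the token IDs are made up and the sequence is padded with two zeros:

```python
import numpy as np
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
token_ids = tf.constant([[3, 5, 0, 0]])  # last two positions are padding

vectors = embedding(token_ids)            # shape (1, 4, 4)
mask = embedding.compute_mask(token_ids)  # True where the ID is non-zero
mask_list = np.asarray(mask).tolist()
```

Inside a Keras model this mask is passed along automatically to layers that support masking.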

What is the purpose of `tf.keras.backend`? (Note: Lower-level, less common in TF2.x high-level code)

`tf.keras.backend` provides a set of backend-agnostic functions for common tensor operations (like `mean`, `sum`, `square`). In TensorFlow 2.x, you typically use the standard `tf.*` operations directly instead of `tf.keras.backend.*` because Keras is integrated with TensorFlow, but it exists for compatibility and advanced use cases.

What is the difference between `tf.keras.losses.CategoricalCrossentropy` and `tf.keras.losses.SparseCategoricalCrossentropy`?

`CategoricalCrossentropy`: Used when the target labels are in one-hot encoded format (e.g., `[0, 0, 1]` for class 2).
`SparseCategoricalCrossentropy`: Used when the target labels are integers (e.g., `2` for class 2). It's often preferred as it saves memory and computation by not requiring one-hot encoding of labels.
Both compute the same quantity and both accept a `from_logits` argument: with the default `from_logits=False` they expect probabilities (typically from a softmax activation), and with `from_logits=True` they expect raw logits.

What is the purpose of `from_logits=True` in loss functions?

When `from_logits=True`, the loss function expects the model's raw output (logits) as input and applies the corresponding activation internally when calculating the loss (softmax for the categorical losses, sigmoid for `BinaryCrossentropy`). This is numerically more stable than applying the activation in the model's last layer and then passing the probabilities to the loss function.
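
The sketch below, with made-up logits for a three-class problem, shows that the sparse and one-hot variants agree, and that passing logits with `from_logits=True` matches passing softmax probabilities with the default setting:

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
int_labels = tf.constant([0])
onehot_labels = tf.constant([[1.0, 0.0, 0.0]])

sparse_loss = float(tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True)(int_labels, logits))
onehot_loss = float(tf.keras.losses.CategoricalCrossentropy(
    from_logits=True)(onehot_labels, logits))
# Same labels, but probabilities instead of logits (from_logits defaults to False).
probs_loss = float(tf.keras.losses.CategoricalCrossentropy()(
    onehot_labels, tf.nn.softmax(logits)))
```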

What is the purpose of `tf.lookup.StaticVocabularyTable`?

`tf.lookup.StaticVocabularyTable` maps strings to integer IDs using a predefined vocabulary. It's static and typically initialized from a file or list. Essential for converting text tokens into numerical representations for embedding layers.

What is the purpose of `tf.data.Dataset.from_generator()`?

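`tf.data.Dataset.from_generator()` creates a dataset from a Python generator function. This is useful when your data loading or preprocessing logic is complex and cannot be easily expressed using standard `tf.data` transformations, or when you need to interface with external libraries. You describe the structure, shapes, and dtypes of the yielded elements with `output_signature`. Because the generator runs in Python, it can become a bottleneck and the resulting dataset cannot be serialized into a graph.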
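
A minimal sketch, with a toy generator that yields an index and a two-element vector:

```python
import tensorflow as tf

def generate_items():
    """Toy generator yielding (index, vector) pairs."""
    for i in range(3):
        yield i, [float(i), float(i)]

dataset = tf.data.Dataset.from_generator(
    generate_items,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.int32),
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
    ))

items = [(int(i), v.numpy().tolist()) for i, v in dataset]
```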

How do you handle variable-length sequences in TensorFlow/Keras?

**Padding:** Pad sequences to a maximum length (using `tf.keras.preprocessing.sequence.pad_sequences` or `dataset.padded_batch()`). If you pad with 0, set `mask_zero=True` on the `Embedding` layer so that downstream layers such as RNNs ignore the padding.
**Ragged Tensors:** Use `tf.RaggedTensor` to represent sequences without padding. Whether Keras layers such as `tf.keras.layers.LSTM` accept `RaggedTensor` inputs directly depends on the Keras version bundled with your TensorFlow release, so check before relying on it.
**Masking:** Use Keras Masking layers (`tf.keras.layers.Masking`) or the `mask` argument in `call()` methods to propagate masking information through the model.
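
The first two options can be sketched as follows, using three made-up sequences of different lengths:

```python
import tensorflow as tf

sequences = [[1, 2, 3], [4], [5, 6]]

# Ragged tensor: keeps the true length of every sequence.
ragged = tf.ragged.constant(sequences)
lengths = ragged.row_lengths().numpy().tolist()

# Padding: each batch is padded with zeros to its longest sequence.
dataset = tf.data.Dataset.from_generator(
    lambda: sequences,
    output_signature=tf.TensorSpec(shape=[None], dtype=tf.int32))
padded = next(iter(dataset.padded_batch(3))).numpy().tolist()
```

Padding per batch rather than to a global maximum keeps batches of short sequences small.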

What is the purpose of `tf.keras.layers.Masking`?

`tf.keras.layers.Masking` is a layer that masks out timesteps in a sequence whose features all equal a specified mask value (default is 0). It generates a mask tensor that is propagated to subsequent layers, indicating which timesteps should be ignored during computation (e.g., in RNNs or pooling layers). This is typically used with padded sequences.

What is the purpose of `tf.keras.backend.clear_session()`?

`tf.keras.backend.clear_session()` clears the Keras backend state. This is useful when running multiple training sessions or experiments in the same process (like in a Jupyter notebook) to avoid interference between runs and ensure that each new model starts with a clean state.

What is the purpose of `tf.random.uniform()` and `tf.random.normal()`?

`tf.random.uniform(shape, minval=0, maxval=None, dtype=tf.float32)`: Generates random numbers from a uniform distribution within the specified range.
`tf.random.normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32)`: Generates random numbers from a normal (Gaussian) distribution with the specified mean and standard deviation.
These are commonly used for initializing model weights.

What is the purpose of `tf.math.reduce_mean()` and `tf.math.reduce_sum()`?

These are aliases for `tf.reduce_mean()` and `tf.reduce_sum()`, providing common mathematical reduction operations. `tf.math` contains a variety of mathematical operations on tensors.

What is the purpose of `tf.where()`?

`tf.where(condition, x=None, y=None)` returns elements chosen from `x` or `y` depending on `condition`. If `x` and `y` are not provided, it returns the coordinates of `True` elements in `condition`. It's useful for conditional logic on tensors.
```
condition = tf.constant([True, False, True])
x = tf.constant([1, 2, 3])
y = tf.constant([10, 20, 30])
result = tf.where(condition, x, y) # [1, 20, 3]
```

What is the purpose of `tf.debugging.assert_shapes()`?

`tf.debugging.assert_shapes(shapes)` asserts that tensors have the expected shapes. `shapes` is a list of `(tensor, shape)` pairs. Each shape entry can be an integer (exact size), a string symbol (which must resolve to the same size everywhere it appears), or `None` to skip the check for that dimension. This is a powerful assertion for verifying tensor shapes at runtime, especially within `tf.function`s.
```
input_tensor = tf.zeros([32, 28, 28, 1])
tf.debugging.assert_shapes([(input_tensor, ('B', 28, 28, 1))]) # B is batch size
```

What is the difference between `tf.keras.Model.fit()` and a custom training loop using `tf.GradientTape`?

`model.fit()`: High-level, opinionated method. Handles batching, shuffling, callbacks, distributed training (with strategy scope), verbose output, and standard training procedures automatically. Less flexible for non-standard training.
Custom Training Loop: Low-level, highly flexible. Requires manual implementation of batching, shuffling, iterating over epochs, computing gradients with `tf.GradientTape`, applying gradients with an optimizer, and logging metrics/summaries. Provides maximum control for complex scenarios.
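
For comparison with `model.fit()`, here is a minimal custom loop on synthetic data. The data, model size, learning rate, and epoch count are all arbitrary choices made for the sketch.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data for y = 3x + 1 (illustrative only).
x = np.linspace(-1.0, 1.0, 64, dtype="float32").reshape(-1, 1)
y = 3.0 * x + 1.0
dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(64).batch(16)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

initial_loss = float(loss_fn(y, model(x)))

for epoch in range(20):
    for x_batch, y_batch in dataset:
        with tf.GradientTape() as tape:
            predictions = model(x_batch, training=True)
            loss = loss_fn(y_batch, predictions)
        # Differentiate the loss and let the optimizer update the weights.
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

final_loss = float(loss_fn(y, model(x)))
```

Everything `fit()` would otherwise do (metrics, callbacks, validation, wrapping the step in `tf.function` for speed) has to be added by hand in this style.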

What is the purpose of `tf.keras.layers.Lambda`?

`tf.keras.layers.Lambda` is a layer that wraps an arbitrary expression as a Layer object. It's useful for inserting simple, stateless tensor transformations into a Sequential or Functional API model without creating a custom Layer subclass.
```
model = tf.keras.Sequential([
    # ... other layers ...
    tf.keras.layers.Lambda(lambda x: x * 2), # Simple scaling layer
    # ... other layers ...
])
```

What are the limitations of `tf.keras.layers.Lambda`?

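Lambda layers are meant to be stateless: `tf.Variable`s created or used inside the wrapped function are not reliably tracked as layer weights. If you need state, you should create a custom Layer.
They are also less portable than built-in layers or properly implemented custom layers, because the wrapped Python function is serialized as bytecode, which may fail to load in a different Python version or environment.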

What is the purpose of `tf.data.Dataset.from_tensor_slices()`?

`tf.data.Dataset.from_tensor_slices()` creates a dataset whose elements are the slices of the input tensors along their first dimension. It's a common way to create a dataset from in-memory NumPy arrays or TensorFlow tensors.

```
features = tf.constant([[1, 2], [3, 4], [5, 6]])
labels = tf.constant([0, 1, 0])
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
# Iterating over dataset yields (tf.Tensor([1, 2]), tf.Tensor(0)), (tf.Tensor([3, 4]), tf.Tensor(1)), ...
```

What is the purpose of `tf.keras.utils.plot_model()`?

`tf.keras.utils.plot_model()` generates a visual representation (a diagram) of a Keras model's architecture. It can show the layers, their connections, input/output shapes, and the flow of data through the model. Requires the `pydot` Python package and Graphviz to be installed.
```
tf.keras.utils.plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
```

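How do you map a vocabulary file to integer IDs with `tf.lookup`?

The public `tf.lookup` API does not appear to include a class named `TextFileToIntegerTable`. The usual way to get this behavior is to combine `tf.lookup.TextFileInitializer` with a static table (`tf.lookup.StaticHashTable` or `tf.lookup.StaticVocabularyTable`). For a vocabulary file with one word per line, use `tf.lookup.TextFileIndex.WHOLE_LINE` as the key and `tf.lookup.TextFileIndex.LINE_NUMBER` as the value, so that each word maps to its zero-based line number.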

What is the purpose of `tf.autograph.to_code()`?

`tf.autograph.to_code()` is a debugging utility that shows the Python code generated by AutoGraph when converting a Python function into a TensorFlow graph. This can be helpful for understanding how AutoGraph transforms your code and for debugging AutoGraph-related issues.

What is the difference between `tf.GradientTape(watch_accessed_variables=False)` and the default?

By default (`watch_accessed_variables=True`), `tf.GradientTape` automatically watches all trainable `tf.Variable`s accessed within its context.
Setting `watch_accessed_variables=False` means the tape will *not* automatically watch trainable variables. You would then need to explicitly call `tape.watch(variable)` for any variables you want to compute gradients with respect to. This provides more fine-grained control but requires more manual effort. The default is usually sufficient.
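
In the following sketch only `a` is watched, so the gradient with respect to `b` comes back as `None` (the variable names and values are arbitrary):

```python
import tensorflow as tf

a = tf.Variable(2.0)
b = tf.Variable(3.0)

with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(a)  # only `a` is tracked
    y = a * b

grad_a, grad_b = tape.gradient(y, [a, b])
```

Here `grad_a` equals the value of `b` (3.0), and `grad_b` is `None` because `b` was never watched.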

What is the purpose of `tf.random.set_seed()`?

`tf.random.set_seed(seed)` sets a global seed for TensorFlow's random number generators. This helps ensure that operations that rely on randomness produce the same results when run multiple times with the same seed, which is crucial for reproducibility.

What is the purpose of `tf.data.AUTOTUNE`?

`tf.data.AUTOTUNE` is a special value that lets the `tf.data` runtime tune a parameter dynamically based on available resources. It is typically passed as `buffer_size` to `dataset.prefetch()` and as `num_parallel_calls` to `dataset.map()` or `dataset.interleave()`, so that the number of prefetched elements and the degree of parallelism are chosen at run time.
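
A small sketch of a pipeline that leaves both the map parallelism and the prefetch buffer to the runtime (the doubling function is a stand-in for real preprocessing):

```python
import tensorflow as tf

dataset = (tf.data.Dataset.range(10)
           .map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(4)
           .prefetch(tf.data.AUTOTUNE))

batches = [batch.numpy().tolist() for batch in dataset]
```

Element order is preserved here because `map()` is deterministic by default even when it runs in parallel.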

What is the purpose of `tf.TensorArray`?

`tf.TensorArray` is a list-like container of tensors that supports writing by index and can grow at run time when created with `dynamic_size=True`. Regular tensors are immutable, so building a sequence element by element inside a graph loop needs a structure like this. Each `write()` returns a `TensorArray` that must be used for subsequent operations. It is primarily used within `tf.function`s or graph execution to handle dynamic sequence processing (e.g., iterating through a sequence timestep by timestep) where the length of the output sequence might not be known beforehand.

When would you use `tf.TensorArray`?

Inside `tf.function`s when you need to build a sequence of tensors where the length is determined dynamically during execution (e.g., decoding sequences in sequence-to-sequence models).
When implementing custom RNN cells or loops in graph mode.
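
The write-then-stack pattern looks like this in a minimal sketch that accumulates a running sum. It runs eagerly here; the same pattern applies inside a `tf.function` loop.

```python
import tensorflow as tf

values = [1.0, 2.0, 3.0]

outputs = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
total = tf.constant(0.0)
for i, v in enumerate(values):
    total = total + v
    # write() returns the TensorArray to use from now on.
    outputs = outputs.write(i, total)

running_sum = outputs.stack()  # combine the elements into one tensor
```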

What is the purpose of `tf.keras.utils.Sequence`?

`tf.keras.utils.Sequence` is a utility class for creating custom data generators in Keras. It provides a thread-safe and multiprocess-safe way to generate batches of data during `model.fit()`. You subclass `Sequence` and implement `__len__()` (number of batches) and `__getitem__(idx)` (return the idx-th batch). It's an alternative to `tf.data` for custom data loading, especially when `tf.data` is difficult to use or when working with datasets that don't fit entirely in memory.
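
A minimal subclass might look like the sketch below. The `ArraySequence` name and the in-memory arrays are assumptions made for illustration; a real implementation would usually load each batch from disk inside `__getitem__`.

```python
import math

import numpy as np
import tensorflow as tf

class ArraySequence(tf.keras.utils.Sequence):
    """Serves batches from two in-memory arrays (hypothetical example)."""

    def __init__(self, x, y, batch_size, **kwargs):
        super().__init__(**kwargs)
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = start + self.batch_size
        return self.x[start:end], self.y[start:end]

x = np.arange(10, dtype="float32").reshape(10, 1)
y = np.arange(10)
sequence = ArraySequence(x, y, batch_size=4)
last_x, last_y = sequence[len(sequence) - 1]
```

An instance can be passed directly as the first argument to `model.fit()`.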

What is the purpose of `tf.experimental.numpy`?

`tf.experimental.numpy` is an experimental module that provides a NumPy-like API for TensorFlow tensors. It aims to make it easier for users familiar with NumPy to write TensorFlow code by providing functions that mimic the NumPy API. It's still under development but allows for more seamless switching between NumPy and TensorFlow.
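
A short sketch using the function-style API (the array contents are arbitrary):

```python
import numpy as np
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

x = tnp.reshape(tnp.arange(6), (2, 3))  # [[0, 1, 2], [3, 4, 5]]
column_sums = tnp.sum(x, axis=0)

# Results interoperate with regular TensorFlow ops and with NumPy.
doubled = tf.multiply(column_sums, 2)
as_numpy = np.asarray(column_sums)
```

Calling `tnp.experimental_enable_numpy_behavior()` additionally enables NumPy-style methods and type promotion on tensors.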

What is the purpose of `tf.Module`?

`tf.Module` is the fundamental building block for creating custom, stateful components in TensorFlow. It allows you to group `tf.Variable`s and `tf.function`-decorated methods together. This structure enables checkpointing and exportability of your custom logic. It's the base class for `tf.keras.layers.Layer` and `tf.keras.Model`.

How does TensorFlow handle control flow (if, for, while) in graphs?

In TensorFlow 1.x graph mode, you had to use special TensorFlow control flow operations like `tf.cond` and `tf.while_loop`.
In TensorFlow 2.x with `@tf.function`, AutoGraph automatically converts Python control flow (`if`, `for`, `while`) that depends on tensor values into equivalent TensorFlow graph operations, allowing you to write more natural Python code that gets compiled into an efficient graph. Control flow that depends only on Python values is executed once, at trace time, so a `for` loop over a Python `range` is unrolled into the graph.

What is the purpose of `tf.lookup.experimental.DatasetInitializer`?

`tf.lookup.experimental.DatasetInitializer` is used to initialize a lookup table from a `tf.data.Dataset` whose elements are `(key, value)` pairs. This is useful when your vocabulary or mapping is stored in a dataset format.

What is the purpose of `tf.data.experimental.enumerate_dataset()`?

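`tf.data.experimental.enumerate_dataset()` transforms a dataset into a dataset of `(index, element)` pairs, similar to Python's `enumerate`. This is useful when you need the index of each element in the pipeline. The experimental function is deprecated; current releases provide the same behavior through the `dataset.enumerate()` method.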

What is the purpose of `tf.data.Dataset.zip()`?

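`tf.data.Dataset.zip()` combines multiple datasets element-wise, much like Python's `zip`. If you have separate datasets for features and labels, you can zip them together to get a dataset where each element is a tuple `(features, labels)`. The zipped dataset ends as soon as the shortest input dataset is exhausted.
```
import tensorflow as tf

x_train, y_train = [[1.0, 2.0], [3.0, 4.0]], [0, 1]  # placeholder data
dataset_features = tf.data.Dataset.from_tensor_slices(x_train)
dataset_labels = tf.data.Dataset.from_tensor_slices(y_train)
combined_dataset = tf.data.Dataset.zip((dataset_features, dataset_labels))  # yields (features, label)
```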

What is the purpose of `tf.data.Dataset.interleave()`?

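`tf.data.Dataset.interleave()` maps a function over the input dataset, where the function returns a new dataset for each element, and then interleaves the elements of those datasets. The `cycle_length` argument controls how many of them are consumed concurrently and `block_length` controls how many consecutive elements are taken from each one per turn. This is useful for reading data from multiple files concurrently, improving data loading performance by overlapping I/O operations.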
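The interleaving order can be sketched without real files. Here each "file" is simulated by a small in-memory dataset, and `fake_file` is a hypothetical stand-in for a reader such as `tf.data.TFRecordDataset`.

```python
import tensorflow as tf

# Stand-in for a dataset of file names: three sources with ids 0, 1, 2.
sources = tf.data.Dataset.range(3)


def fake_file(source_id):
    # Hypothetical reader: yields two records tagged with the source id.
    return tf.data.Dataset.from_tensors(source_id).repeat(2)


interleaved = sources.interleave(
    fake_file,
    cycle_length=3,  # number of sources consumed concurrently
    block_length=1,  # consecutive elements taken from each source per turn
    num_parallel_calls=tf.data.AUTOTUNE,
    deterministic=True,  # keep a reproducible output order
)

order = [int(x) for x in interleaved]
print(order)  # round-robin over the three sources
```

With real files, `fake_file` would typically be replaced by a function that opens one file per element, so that slow reads from one file overlap with reads from the others.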

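What is the purpose of `tf.data.Dataset.from_generator()`?

`tf.data.Dataset.from_generator()` creates a dataset from a Python generator function. It is useful when the data loading logic is too complex to express with the built-in dataset sources. Because the generator runs as ordinary Python code, it is not serialized with the graph and can become a bottleneck in performance-critical pipelines.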
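A small sketch of wrapping a generator in a dataset. The generator `squares` is illustrative, and the `output_signature` argument assumes TF 2.4 or newer.

```python
import tensorflow as tf


def squares():
    # Any Python logic can go here; each yield becomes one dataset element.
    for i in range(4):
        yield i, i * i


ds = tf.data.Dataset.from_generator(
    squares,
    # Declares the structure, shape, and dtype of each yielded element.
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.int32),
        tf.TensorSpec(shape=(), dtype=tf.int32),
    ),
)

items = [(int(a), int(b)) for a, b in ds]
print(items)
```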

What is the purpose of `tf.random.Generator`?

`tf.random.Generator` is the TensorFlow 2.x API for stateful random number generation. Each generator object owns its state in a `tf.Variable`, so you control when that state is created, advanced, checkpointed, or split into independent random streams. This is useful for distributed training and for ensuring reproducibility in complex setups. It's an alternative to the older functions such as `tf.random.normal`, which rely on global and operation-level seeds.
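A brief sketch of seeding and splitting generators. The seed value and shapes are arbitrary.

```python
import tensorflow as tf

# Two generators created from the same seed produce the same sequence.
g1 = tf.random.Generator.from_seed(42)
g2 = tf.random.Generator.from_seed(42)
a = g1.normal(shape=(3,))
b = g2.normal(shape=(3,))

# Each call advances the generator's internal state, so the next draw differs.
c = g1.normal(shape=(3,))

# split() derives independent generators, e.g. one per replica or worker.
streams = g1.split(2)
per_stream = [g.uniform(shape=(2,)) for g in streams]

print(bool(tf.reduce_all(a == b)), len(streams))
```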

What is the difference between `tf.train.AdamOptimizer` and `tf.keras.optimizers.Adam`? (TF1.x vs TF2.x)

`tf.train.AdamOptimizer`: The Adam optimizer implementation in TensorFlow 1.x, designed for graph execution and sessions. In TF2.x it is only reachable through the compatibility namespace as `tf.compat.v1.train.AdamOptimizer`.
`tf.keras.optimizers.Adam`: The Adam optimizer implementation in TensorFlow 2.x (and part of Keras). It works with Eager Execution and `tf.function`, and it is the recommended version in TF2.x.
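A minimal sketch of a TF2.x custom training loop with `tf.keras.optimizers.Adam`, run eagerly. The toy data, model, and learning rate are illustrative choices.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Toy regression data following y = 2x + 1.
x = tf.constant([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * x + 1.0

model = tf.keras.Sequential([tf.keras.Input(shape=(1,)), tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)


def train_step():
    # No session is needed: gradients are recorded eagerly by the tape.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


first_loss = float(train_step())
for _ in range(200):
    last_loss = float(train_step())

print(first_loss, last_loss)
```

In TF1.x the equivalent code would build a `minimize()` op once and run it repeatedly inside a session.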

What is the purpose of `tf.saved_model.save()` and `tf.saved_model.load()`?

These are the lower-level functions for saving and loading SavedModels. `tf.saved_model.save()` accepts any trackable object, such as a `tf.Module` or a Keras model. When Keras saves in the SavedModel format, `tf.keras.Model.save()` and `tf.keras.models.load_model()` are built on top of these functions. You might use the lower-level API if you are saving a custom `tf.Module` or need more fine-grained control over the saving process (e.g., specifying custom signatures).
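A sketch of exporting a custom `tf.Module` with an explicit signature and loading it back. The `Adder` class and the temporary export directory are assumptions made for illustration.

```python
import tempfile

import tensorflow as tf


class Adder(tf.Module):
    """Hypothetical module that adds a stored offset to its input."""

    def __init__(self):
        super().__init__()
        self.offset = tf.Variable(10.0)

    # A fixed input signature lets the function be exported without a prior call.
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def add(self, x):
        return x + self.offset


module = Adder()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(module, export_dir, signatures={"serving_default": module.add})

restored = tf.saved_model.load(export_dir)

# The restored object exposes the tf.function as an attribute...
direct = restored.add(tf.constant([1.0, 2.0]))

# ...and the named signature, which returns a dict of output tensors.
serving_fn = restored.signatures["serving_default"]
via_signature = list(serving_fn(x=tf.constant([1.0, 2.0])).values())[0]

print(direct.numpy(), via_signature.numpy())
```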

What is the purpose of `tf.Module`?

`tf.Module` serves as a container for trainable variables and functions (`@tf.function`) you want to save and restore using `tf.train.Checkpoint` or export as part of a SavedModel. It provides a structured way to define custom, reusable components in TensorFlow that manage their own state. It's the foundation for Keras Models and Layers.
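The save-and-restore workflow with `tf.train.Checkpoint` can be sketched as follows. The `Counter` module and the temporary directory are illustrative.

```python
import os
import tempfile

import tensorflow as tf


class Counter(tf.Module):
    """Hypothetical stateful module holding a single counter variable."""

    def __init__(self):
        super().__init__()
        self.count = tf.Variable(0, dtype=tf.int64)

    @tf.function
    def increment(self):
        return self.count.assign_add(1)


counter = Counter()
counter.increment()
counter.increment()

# Save the module's variables under a checkpoint prefix.
ckpt_prefix = os.path.join(tempfile.mkdtemp(), "ckpt")
save_path = tf.train.Checkpoint(module=counter).save(ckpt_prefix)

# Restore into a fresh instance; variables are matched by object structure.
fresh = Counter()
tf.train.Checkpoint(module=fresh).restore(save_path)
restored_count = int(fresh.count.numpy())
print(restored_count)
```

A checkpoint stores only variable values. Exporting the functions as well requires a SavedModel.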
