Machine Learning

Transformers: how do they work internally?

Table of Contents: Introduction, Input Embedding, Positional Encoding (PE), The Encoder, Self-attention Mechanism, Multi-head Attention Mechanism, Feedforward Network, The Decoder, Masked Multi-head Attention, Multi-head Attention, Feedforward Network, Linear and Softmax Layer, Transformer Training, Conclusion

The Transformer is currently one of the most popular architectures for NLP. We can periodically...
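As a taste of what the post covers, here is a minimal sketch of scaled dot-product attention, the core of the self-attention mechanism: softmax(QK^T / sqrt(d_k)) V. This is our own illustration rather than code from the post, and the tensor names and shapes are hypothetical:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    # Similarity of every query against every key, scaled by sqrt(d_k)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    # Normalize the scores into attention weights that sum to 1 per query
    weights = tf.nn.softmax(scores, axis=-1)
    # Weighted sum of the values
    return tf.matmul(weights, v)

# Hypothetical shapes: (batch, seq_len, d_model)
q = tf.random.normal((1, 4, 8))
k = tf.random.normal((1, 4, 8))
v = tf.random.normal((1, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 4, 8)
```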

Continue reading...

TensorFlow High-Level Libraries: TF Estimator

TensorFlow has several high-level libraries that let us reduce the time spent writing models in core TensorFlow code. TF Estimator makes it simple to create models and handles training, evaluating, predicting, and exporting. TF Estimator provides four main functions on any kind of estimator: estimator.fit(), estimator.evaluate(), estimator.predict(), and estimator.export(). All predefined estimators are...
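As a quick sketch of this workflow (ours, not taken from the post), the snippet below trains and evaluates a premade tf.estimator.DNNClassifier. Note that in the tf.estimator API that succeeded tf.contrib.learn, fit() became train() and export() became export_saved_model(); the feature column, input function, and toy data here are all made up for illustration:

```python
import tensorflow as tf

# Hypothetical feature column: a single 4-dimensional numeric feature "x"
feature_cols = [tf.feature_column.numeric_column("x", shape=[4])]

def input_fn():
    # Toy in-memory dataset: random 4-feature examples with binary labels
    features = {"x": tf.random.normal((32, 4))}
    labels = tf.random.uniform((32,), maxval=2, dtype=tf.int32)
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(8).repeat()

# Premade estimator: a small fully connected classifier
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_cols,
    hidden_units=[16, 8],
    n_classes=2,
)

estimator.train(input_fn=input_fn, steps=100)   # fit() in older versions
metrics = estimator.evaluate(input_fn=input_fn, steps=10)
print(metrics)  # accuracy, loss, etc.
```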

Continue reading...

Loss Functions (Part 1)

Implementing loss functions is very important in machine learning because they measure the error between predicted outputs and target values. Algorithms are optimized by evaluating outcomes against a specified loss function, and TensorFlow works this way as well. We can think of loss functions...
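For a concrete flavor of what the post develops, here is a minimal TensorFlow 2.x sketch (our own, with made-up numbers) comparing the L2 (squared) and L1 (absolute) losses on a toy regression target:

```python
import tensorflow as tf

# Hypothetical predictions and targets for a small regression problem
predictions = tf.constant([1.0, 2.5, 4.0])
targets = tf.constant([1.0, 2.0, 3.0])

# L2 (squared) loss: penalizes large errors more heavily
l2_loss = tf.reduce_mean(tf.square(predictions - targets))

# L1 (absolute) loss: more robust to outliers
l1_loss = tf.reduce_mean(tf.abs(predictions - targets))

print(l2_loss.numpy())  # 0.4166667 = (0.0 + 0.25 + 1.0) / 3
print(l1_loss.numpy())  # 0.5       = (0.0 + 0.5  + 1.0) / 3
```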

Continue reading...

Activation Functions (updated)

Table of Contents: What is an activation function?, Activation Functions, Sigmoid, ReLU (Rectified Linear Unit), ReLU6, Hyperbolic Tangent, ELU (Exponential Linear Unit), Softmax, Softplus, Softsign, Swish, Sinc, Leaky ReLU, Mish, GELU (Gaussian Error Linear Unit), SELU (Scaled Exponential Linear Unit)

What is an activation function? An activation function is a...
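As a quick preview, several of these activations are exposed directly under tf.nn. The sketch below (ours, using TensorFlow 2.x eager execution and arbitrary inputs) evaluates a few of them on the same tensor:

```python
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

# A few of the activations covered in the post, as exposed by TensorFlow
print(tf.nn.sigmoid(x).numpy())   # squashes input into (0, 1)
print(tf.nn.relu(x).numpy())      # max(0, x)
print(tf.nn.relu6(x).numpy())     # min(max(0, x), 6)
print(tf.nn.tanh(x).numpy())      # squashes input into (-1, 1)
print(tf.nn.elu(x).numpy())       # exp(x) - 1 for x < 0, x otherwise
print(tf.nn.softplus(x).numpy())  # log(exp(x) + 1), a smooth ReLU
print(tf.nn.softsign(x).numpy())  # x / (|x| + 1)
```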

Continue reading...