The Math: Open-End Coder

The Open-End Coder uses Natural Language Processing (NLP) to group similar text responses into themes.

1. Embeddings

We convert each text response into a high-dimensional vector (an embedding) using a Transformer model (e.g., Xenova/all-MiniLM-L6-v2) that runs locally in your browser via WebAssembly. Responses with similar meanings end up as vectors that lie close to each other.
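To make "close to each other" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The three-dimensional vectors and the example phrases are made up for illustration (all-MiniLM-L6-v2 itself outputs 384 dimensions), and cosine similarity is an assumption here: it is a common way to compare sentence embeddings, but this page does not state which measure the tool uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; real model output is much longer.
price_1 = [0.9, 0.1, 0.0]   # e.g. "Too expensive"
price_2 = [0.8, 0.2, 0.1]   # e.g. "Costs too much"
support = [0.1, 0.9, 0.2]   # e.g. "Support was slow"

sim_same_topic = cosine_similarity(price_1, price_2)
sim_diff_topic = cosine_similarity(price_1, support)
print(sim_same_topic > sim_diff_topic)
```

In this toy setup the two price-related vectors score higher against each other than either does against the support-related vector, which is the property the clustering step relies on.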

2. Clustering

We use K-Means clustering on these vectors to group semantically similar responses together. The number of clusters (k) is determined automatically or can be set manually.
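As a rough illustration of the clustering step, the sketch below implements the standard K-Means loop (Lloyd's algorithm) in plain Python. It is not the tool's actual code: the function names, the random initialization, and the `seed` and `max_iters` parameters are assumptions, and k is passed in manually because this page does not describe how the automatic choice of k works.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, max_iters=100, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids

# Two well-separated groups of toy 2-D points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With real embeddings the vectors would have hundreds of dimensions, but the loop is the same: assign, recompute centroids, repeat until the assignments no longer change.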

3. Theme Extraction

For each cluster, we derive a "Theme Name" in one of two ways: from the cluster's most representative keywords, ranked by TF-IDF, or from the single response closest to the cluster centroid.
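Both naming strategies can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the tool's implementation: the helper names are hypothetical, each cluster is treated as one "document" for TF-IDF, the IDF uses a common smoothed form, and no stop-word list is applied.

```python
import math
import re
from collections import Counter

def closest_to_centroid(responses, vectors):
    """Return the response whose vector is nearest the cluster mean."""
    n = len(vectors)
    centroid = [sum(col) / n for col in zip(*vectors)]
    def dist(v):
        return sum((x - c) ** 2 for x, c in zip(v, centroid))
    best = min(range(n), key=lambda i: dist(vectors[i]))
    return responses[best]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_keywords(clusters, cluster_id, top_n=3):
    """Rank a cluster's terms by TF-IDF, one 'document' per cluster."""
    docs = {
        cid: Counter(t for r in rs for t in tokenize(r))
        for cid, rs in clusters.items()
    }
    n_docs = len(docs)
    counts = docs[cluster_id]
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        df = sum(1 for c in docs.values() if term in c)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF (assumption)
        scores[term] = (count / total) * idf
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Toy clusters of responses (made-up data).
clusters = {
    0: ["The price is too high", "Price is too high for me"],
    1: ["Support is too slow", "Slow support"],
}
keywords_0 = top_keywords(clusters, 0, top_n=2)
keywords_1 = top_keywords(clusters, 1, top_n=2)

# Toy 2-D vectors for three responses in one cluster (made-up data).
representative = closest_to_centroid(
    ["Too pricey", "The price is too high", "I cannot afford this at all"],
    [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]],
)
print(keywords_0, keywords_1, representative)
```

Words such as "is" and "too" appear in both toy clusters, so their IDF is lower and they drop below cluster-specific terms like "price" or "support". The centroid approach instead returns a full response, which tends to read more naturally as a theme label.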