Exploring Autoencoder-based Representations for Tabular Data Classification

Il’murat Tokhtakhunov; Marat Nurtas; Alexander Neftissov; Sharofiddin Pirnaev; Ilyas Kazambayev; Lalita Kirichenko

From The Journal:

Engineered Science

Volume 37, 2025

Authors and affiliations

Il’murat Tokhtakhunov¹,²
Marat Nurtas¹,³
Alexander Neftissov⁴,⁵
Sharofiddin Pirnaev⁶
Ilyas Kazambayev⁴,⁵
Lalita Kirichenko⁴,⁵

¹Department of Mathematical and Computer Modelling, International Information Technology University, 34/1 Manas street, Almaty, 05000, Kazakhstan
²School of Digital Technologies, Narxoz University, 55 Zhandosov street, Almaty, 050035, Kazakhstan
³Faculty of Information technology, Al-Farabi Kazakh National University, 71 Al-Farabi Avenue, Almaty, 050040, Kazakhstan
⁴Science Innovation Center Industry 4.0, Astana IT University, Mangilik El C1, Astana, 010000, Kazakhstan
⁵Academy of Physical Education and Mass Sports, Mangilik El B2.2, Astana, 010000, Kazakhstan
⁶Department of Engineering Technological Machines, Tashkent State Transport University, 1 Temiryolchilar street, Mirabad district, Tashkent, 100167, Uzbekistan

Abstract

Autoencoders are evaluated as a means of constructing compact and informative vector representations for classification tasks involving high-dimensional tabular data. The methodology addresses the limitations of traditional models that rely on manual feature engineering and task-specific training. Emphasis is placed on building a generalized look-alike model for targeted advertising, using embeddings derived from subscriber-related entities. The approach is assessed on a real-world telecommunications dataset comprising subscriber demographics, devices, tariffs, and network characteristics. Experimental results demonstrate that embeddings produced by autoencoders outperform classical dimensionality reduction methods such as Principal Component Analysis (PCA) in both predictive quality and computational efficiency. Compressed representations enable the identification of nonlinear patterns and semantic similarities, improving classification accuracy across multiple metrics. The study further introduces an integrated vector architecture built by concatenating embeddings from heterogeneous entities. Cosine similarity is employed as the metric for identifying similar users, enabling the development of a scalable and automated recommendation service for Business-to-Business (B2B) applications. Performance is benchmarked using traditional quality metrics (precision, recall, F1-score, i.e. the harmonic mean of precision and recall, and the area under the receiver operating characteristic curve (ROC AUC)) as well as business-specific indicators such as conversion rate and lift. The findings support the applicability of autoencoders in modeling complex tabular structures with minimal information loss. Future work includes the development of domain-specific autoencoder ensembles and the exploration of alternative vector similarity metrics for broader industrial adoption. The proposed solution can also be applied to water resource monitoring systems to improve classification and subsequent prediction.
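The look-alike step described in the abstract (concatenating per-entity embeddings, ranking users by cosine similarity, and scoring the ranking with lift) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vectors standing in for autoencoder outputs, the per-block L2 normalisation, and the use of a mean seed vector are all assumptions made for illustration.

```python
import numpy as np


def concat_embeddings(*blocks):
    """Concatenate per-entity embedding blocks (e.g. device, tariff).

    Assumption: each block is L2-normalised row-wise first so that no
    single entity dominates the combined vector.
    """
    normed = []
    for block in blocks:
        block = np.asarray(block, dtype=float)
        norms = np.linalg.norm(block, axis=1, keepdims=True)
        normed.append(block / np.clip(norms, 1e-12, None))
    return np.hstack(normed)


def cosine_scores(users, seeds):
    """Cosine similarity of every user vector to the mean seed vector."""
    centroid = seeds.mean(axis=0)
    num = users @ centroid
    den = np.linalg.norm(users, axis=1) * np.linalg.norm(centroid)
    return num / np.clip(den, 1e-12, None)


def lift_at_k(scores, converted, k):
    """Conversion rate among the top-k scored users over the base rate."""
    top = np.argsort(-scores, kind="stable")[:k]
    return converted[top].mean() / converted.mean()


# Toy stand-ins for autoencoder embeddings of two entities (hypothetical data).
device_emb = [[1, 0], [0, 1], [1, 0], [0, 1]]
tariff_emb = [[1, 0], [1, 0], [0, 1], [0, 1]]

vectors = concat_embeddings(device_emb, tariff_emb)
seeds = vectors[[0]]                  # user 0 is the known "seed" customer
scores = cosine_scores(vectors, seeds)

converted = np.array([1, 0, 0, 1])    # hypothetical campaign outcomes
lift = lift_at_k(scores, converted, k=1)
print(scores, lift)
```

In practice the toy arrays would be replaced by the encoder outputs for each entity, and the seed set by the advertiser's existing customers; other aggregation choices than the mean seed vector are equally plausible.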


Publication details

Received: 03 Jul 2025
Revised: 06 Aug 2025
Accepted: 13 Aug 2025
Published online: 11 Sep 2025

Article type: Research Paper
DOI: 10.30919/es1703
Volume: 37
Article: 1703
Citation: Engineered Science, 2025, 37, 1703


Description:

An embedding-based feature vector is introduced for generalized look-alike modeling, utilizing compr....
