
  • Student name: Tamjid Ahsan
  • Student pace: Full Time
  • Scheduled project review date/time: May 27, 2021, 05:00 PM [DST]
  • Instructor name: James Irving


The garment industry is one of the most labor-intensive industries and needs a large workforce to operate efficiently and keep up with global demand for garment products. Because of this inherent dependency on human capital, the production of a garment company relies heavily on the productivity of the employees in its different departments. Often the actual productivity of garment employees falls short of the targeted productivity that was set. Closing this gap is a high priority for an organization that wants to meet deadlines and maximize profit through proper utilization of resources. When a productivity gap occurs, the company faces a huge loss in production.


A garment production pipeline consists of a handful of sequential processes, e.g., designing, sample confirmation, sourcing and merchandising, lay planning, marker planning, spreading and cutting, sewing, washing, finishing, and packaging, and then exporting if the order is an international one. An efficient garment production always has a line plan with explicit details of when production will start, how many pieces are expected, and when the order needs to be completed. To complete a whole production run within the target time, these sequential processes need to be performed efficiently. To meet the production goals, the associated industrial engineers strategically set a targeted productivity value for each working team in the manufacturing process. However, it is common for the actual productivity to fall out of line with the target because of several factors, both internal and external.

I shall use various machine learning techniques for predicting the productivity of the garment employees based on their previous data of internal factors.

More specifically:

  • Predict poor worker performance, optimizing the model for precision.
    • Focus on predicting poor performance, so that few such cases are missed.
    • Focus on maximizing true negatives and minimizing false positives while tackling model overfitting.



The data is obtained from the UCI Machine Learning Repository, titled "Productivity Prediction of Garment Employees Data Set" by Abdullah Al Imran[1], and can be found here. A copy of the data is in this repository at /data/garments_worker_productivity.csv.

The collected dataset contains the production data of the sewing and finishing departments for three months, from January 2015 to March 2015, of a renowned garment manufacturing company in Bangladesh[2]. The dataset consists of 1197 instances and includes 15 attributes.

Features with explanation.
  • date: Date in MM-DD-YYYY format.
  • day: Day of the Week.
  • quarter: A portion of the month. A month was divided into four quarters.
  • department: Associated department with the instance.
  • team_no: Associated team number with the instance.
  • no_of_workers: Number of workers in each team.
  • no_of_style_change: Number of changes in the style of a particular product.
  • targeted_productivity: Targeted productivity set by the Authority for each team for each day.
  • smv: Standard Minute Value, it is the allocated time for a task.
  • wip: Work in progress. Includes the number of unfinished items for products.
  • over_time: Represents the amount of overtime by each team in minutes.
  • incentive: Represents the amount of financial incentive (in BDT[3]) that enables or motivates a particular course of action.
  • idle_time: The amount of time when the production was interrupted due to several reasons.
  • idle_men: The number of workers who were idle due to production interruption.
  • actual_productivity: The actual % of productivity that was delivered by the workers. It ranges from 0-1[4].

[1] Rahim, M. S., Imran, A. A., & Ahmed, T. (2021). Mining the Productivity Data of Garment Industry. International Journal of Business Intelligence and Data Mining, 1(1), 1.
[2] Bangladesh is a developing country which is the second largest apparel exporting country in the world.
[3] 1 USD = 84.83 BDT, as of May 23, 2021. Check here to see rates from Bangladesh Bank.
[4] Measured by the production engineers of the organization. The methodology of this calculation is not public.


doi = {10.1504/ijbidm.2021.10028084},
url = {[Web Link]},
year = 2021,
publisher = {Inderscience Publishers},
volume = {1},
number = {1},
pages = {1},
author = {Md Shamsur Rahim and Abdullah Al Imran and Tanvir Ahmed},
title = {Mining the Productivity Data of Garment Industry},
journal = {International Journal of Business Intelligence and Data Mining}



import pandas as pd

# loading data from local source
df = pd.read_csv('./data/garments_worker_productivity.csv')
# 10 random samples from the dataset; data loaded successfully.
date quarter department day team targeted_productivity smv wip over_time incentive idle_time idle_men no_of_style_change no_of_workers actual_productivity
797 2/16/2015 Quarter3 finishing Monday 5 0.75 4.15 NaN 1200 0 0.0 0 0 10.0 0.629417
693 2/10/2015 Quarter2 finishing Tuesday 2 0.80 3.94 NaN 2160 0 0.0 0 0 18.0 0.966759
530 1/31/2015 Quarter5 finishing Saturday 11 0.65 3.94 NaN 600 0 0.0 0 0 5.0 0.971867
325 1/19/2015 Quarter3 sweing Monday 2 0.70 22.94 1006.0 10170 38 0.0 0 0 56.5 0.750518
1055 3/4/2015 Quarter1 sweing Wednesday 3 0.80 29.40 1169.0 6840 63 0.0 0 0 57.0 0.800333
1036 3/3/2015 Quarter1 finishing Tuesday 8 0.75 4.60 NaN 3360 0 0.0 0 0 8.0 0.702778
890 2/23/2015 Quarter4 sweing Monday 5 0.80 30.10 541.0 7140 38 0.0 0 0 59.0 0.800137
470 1/27/2015 Quarter4 sweing Tuesday 9 0.70 29.12 1294.0 6960 50 0.0 0 0 58.0 0.700386
597 2/3/2015 Quarter1 finishing Tuesday 6 0.70 2.90 NaN 960 0 0.0 0 0 8.0 0.495417
945 2/26/2015 Quarter4 sweing Thursday 7 0.80 30.10 694.0 4080 50 0.0 0 1 59.0 0.800809


  • Every feature has the correct data type except team.

  • team is categorical data that is labeled numerically.

  • wip has NaN values. These are not missing values: on days with no work in progress, the field was left empty. They can safely be filled with 0.

  • smv depends on the product.

  • department has a naming issue.

  • The value 'Quarter5' in quarter is inconsistent with the data description. These rows are for January 29 (Thursday) and January 31 (Saturday), 2015. I cannot come up with any rationale for this treatment, so I am leaving it as is for now; another option is to merge these with Quarter4. Every other feature is clean and coherent.


  • None of the features is normally distributed.
  • Most of them are skewed, e.g., idle_men, idle_time, incentive, wip.
  • target has only a few regularly occurring values.
  • smv and over_time have some very high values.

Feature engineering

Creating target; performance

I am treating this as a binary classification problem. For this I am converting actual_productivity into a binary class: if actual_productivity is greater than targeted_productivity, the label is 1, and 0 otherwise. I am not encoding the label as text because most models require a numerical target, so this eliminates the need for label encoding. For binary classification, numeric labels also do not cause confusion when reading model performance reports.

1    0.730994
0    0.269006
Name: performance, dtype: float64

Straight away I can spot a class imbalance issue. I have to address this later while modeling.
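The target construction described above can be sketched as follows. The toy rows are illustrative stand-ins for the real data, not values taken from it.

```python
import pandas as pd

# Toy rows standing in for the real dataset (illustrative values only).
toy = pd.DataFrame({
    "targeted_productivity": [0.80, 0.75, 0.70, 0.80],
    "actual_productivity": [0.85, 0.70, 0.70, 0.81],
})

# 1 when actual productivity exceeds the target, 0 otherwise.
toy["performance"] = (
    toy["actual_productivity"] > toy["targeted_productivity"]
).astype(int)

# Relative class frequencies, in the same form as the output above.
class_balance = toy["performance"].value_counts(normalize=True)
```

Note that a row that exactly meets its target is labeled 0 under this rule, because the comparison is strict.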

Cleaning wip

  • filling NaN's with 0, meaning no wip for that session

Text cleaning in department categories

Cleaning quarter

  • as identified before, cleaning by merging Quarter5 with Quarter4
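A minimal sketch of the three cleaning steps above, on a toy frame. The assumption that the department naming issue is stray whitespace is mine; the actual cleaning code lives in the notebook.

```python
import pandas as pd

# Toy frame with the three problems described above (illustrative values).
toy = pd.DataFrame({
    "wip": [1006.0, None, 541.0],
    "department": ["sweing", "finishing ", "finishing"],
    "quarter": ["Quarter3", "Quarter5", "Quarter4"],
})

# NaN in wip means no work in progress for that session.
toy["wip"] = toy["wip"].fillna(0)

# Assumption: the naming issue is trailing whitespace in the labels.
toy["department"] = toy["department"].str.strip()

# Merge the inconsistent Quarter5 into Quarter4.
toy["quarter"] = toy["quarter"].replace({"Quarter5": "Quarter4"})
```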

Cleaning targeted_productivity

targeted_productivity stats:
 0.8	 : mode
 0.73	 : mean
 0.7	 : 25% quantile


From this plot I can safely assume that this is a data entry error: setting a target so low does not make any sense. I am filling this with the 25% quantile.

No error remains.
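One way to express this fix is sketched below. The series values and the cutoff `LOW_CUTOFF` are hypothetical, chosen only to illustrate replacing an implausibly low target with the 25% quantile.

```python
import pandas as pd

# Illustrative targets with one implausibly low entry.
target = pd.Series([0.80, 0.75, 0.07, 0.70, 0.80, 0.65, 0.80, 0.70])

# 25% quantile of the column, used as the replacement value.
q25 = target.quantile(0.25)

# Hypothetical cutoff below which a target is treated as an entry error.
LOW_CUTOFF = 0.1

# Replace only the values below the cutoff; keep everything else.
cleaned = target.mask(target < LOW_CUTOFF, q25)
```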

Drop features

Dropping date, as it is not useful for modeling and timing is captured in the day and quarter features, and actual_productivity, as it is the target in continuous format.


which department has better performance


The finishing and sewing departments have similar targets, and the finishing department often fails to meet its daily goal. This can be explained by the small size of the finishing department; adding a few workers could be beneficial.

productive day of the week


Overall the pattern is the same across days, with a slightly higher share of unmet goals on Sunday and Thursday.

exploring team

team size


Team size in the finishing department is generally small.

efficient team


Finishing department fails to achieve goal more often.

wip on performance


At higher wip there is less chance of failing.


Same pattern: low wip does not necessarily mean a good workday. Some leftover work for the next day can mean that there is a greater chance of meeting that day's goal.

incentive on performance


The incentive distribution is highly skewed. Let's slice it into bins of 500 BDT.


On most days there is no incentive payment.


After binning, it can be seen that performance is better at higher incentive: no goal went unmet at generous incentive levels.
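The binning step can be sketched with `pd.cut`. The toy values and the upper bin edge are illustrative assumptions; only the 500 BDT bin width comes from the text above.

```python
import pandas as pd

# Toy incentive and performance values (illustrative only).
toy = pd.DataFrame({
    "incentive": [0, 0, 38, 63, 50, 960, 1200, 0],
    "performance": [0, 1, 1, 1, 0, 1, 1, 0],
})

# Bin edges every 500 BDT; the upper limit of 1500 is an assumption.
edges = list(range(0, 1501, 500))
toy["incentive_bin"] = pd.cut(toy["incentive"], bins=edges, include_lowest=True)

# Share of goal-met days within each incentive bin.
share_goal_met = toy.groupby("incentive_bin", observed=False)["performance"].mean()
```

`include_lowest=True` keeps the zero-incentive days in the first bin instead of dropping them as out of range.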

preparing data for model

split using sklearn

I am using a train-test split approach here. The other option is a train-validation-test split. As the data set is relatively small, the latter approach would leave fewer samples to train on. This is a real issue for some of the models used, which perform better with more training data.

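from sklearn.model_selection import train_test_split

# separate the features from the binary target
X = df.drop(columns='performance').copy()
y = df['performance'].copy()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25)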
Class balance y_train: 
1    0.729097
0    0.270903
Name: performance, dtype: float64

Class balance y_test: 
1    0.736667
0    0.263333
Name: performance, dtype: float64

The distribution of the target class is roughly consistent between the two splits. The split can be re-run for a different distribution, but this is not necessary as I am tackling the class imbalance issue with SMOTENC.
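If a consistent class distribution across the splits is wanted without re-running, one option is a stratified, seeded split. This is a sketch on toy data, not the split used for the results in this document; `random_state=42` is an arbitrary choice.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data with roughly the same class balance as the real target.
X_toy = pd.DataFrame({"feature": range(100)})
y_toy = pd.Series([1] * 73 + [0] * 27, name="performance")

# stratify keeps the class ratio similar in both splits;
# random_state makes the split reproducible.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.25, stratify=y_toy, random_state=42
)
```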

Addressing class imbalance using SMOTENC

from imblearn.over_sampling import SMOTENC

# keeping a class-imbalanced copy of the dataset for evaluating the process
X_imba, y_imba = X.copy(), y.copy()
# creating a copy just to be safe.
XX = X.copy()
yy = y.copy()
# first four features are categorical;
# the original paper (Rahim, 2021) attached to this dataset also
# considered `team` as a categorical feature.
# positional indices of those four columns (inferred from the comment above)
smotenc_features = [0, 1, 2, 3]
# initialize SMOTENC
oversampling = SMOTENC(categorical_features=smotenc_features, n_jobs=-1)
# fitting on the full X and y
XX_oversampled, yy_oversampled = oversampling.fit_resample(XX, yy)
# updating dataset: the oversampled full data becomes the training set
X_train, y_train = XX_oversampled.copy(), yy_oversampled.copy()


I used a Pipeline from sklearn with a custom function to transform the data. These are available in this repository.
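The repository's helper is not reproduced here. As a rough idea of what such a function might do, the sketch below scales numeric columns and one-hot encodes categorical ones. `preprocess_sketch` and its signature are hypothetical and are not the actual `fun.dataset_preprocessing_pipeline`.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def preprocess_sketch(X_train, X_test, categorical, drop=None):
    """Hypothetical stand-in for the repository's preprocessing helper."""
    numeric = [c for c in X_train.columns if c not in categorical]
    transformer = ColumnTransformer(
        [
            ("scale", StandardScaler(), numeric),
            ("ohe", OneHotEncoder(drop=drop), categorical),
        ],
        sparse_threshold=0,  # always return a dense array
    )
    # Fit on the training split only, then reuse the fitted transform on test.
    train_arr = transformer.fit_transform(X_train)
    test_arr = transformer.transform(X_test)
    return train_arr, test_arr


# Toy usage with illustrative values.
train = pd.DataFrame({
    "department": ["sweing", "finishing", "sweing", "finishing"],
    "smv": [22.9, 3.9, 30.1, 4.2],
})
test = pd.DataFrame({"department": ["finishing"], "smv": [2.9]})
train_arr, test_arr = preprocess_sketch(train, test, categorical=["department"])
```

Passing `drop='first'` removes one dummy column per categorical feature, which is the usual choice for logistic regression.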


dummy model

# SMOTENC'ed, StandardScaled and OHE'ed data
X_train_dummy, X_test_dummy = fun.dataset_preprocessing_pipeline(
    X_train, X_test)

dummy_classifier = DummyClassifier(strategy='stratified')
Class balance y_train: 
1    0.5
0    0.5
Name: performance, dtype: float64

Class balance y_test: 
1    0.736667
0    0.263333
Name: performance, dtype: float64


Report of DummyClassifier type model using train-test split dataset.

Train accuracy score: 0.4931
Test accuracy score: 0.5
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on train data of:
              precision    recall  f1-score   support

           0       0.52      0.53      0.52       875
           1       0.52      0.52      0.52       875

    accuracy                           0.52      1750
   macro avg       0.52      0.52      0.52      1750
weighted avg       0.52      0.52      0.52      1750



Classification report on test data of:
              precision    recall  f1-score   support

           0       0.28      0.48      0.36        79
           1       0.75      0.56      0.64       221

    accuracy                           0.54       300
   macro avg       0.52      0.52      0.50       300
weighted avg       0.63      0.54      0.57       300



This is a worthless model, as expected from a baseline. The f1 score is low and model accuracy is about 0.5, which is no better than flipping a coin.

logistic regression

filter with Pearson corr


Most of them are correlated with the target except no_of_style_change and targeted_productivity.


  • No significant correlation is detected except between no_of_workers and smv.

  • over_time and no_of_workers are correlated.

    Features should be dropped: {'no_of_workers'}
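A common way to arrive at such a drop list is to scan the upper triangle of the absolute Pearson correlation matrix. The function name, the 0.75 threshold, and the toy values below are assumptions for illustration, not the code or threshold used in this project.

```python
import numpy as np
import pandas as pd


def correlated_features(frame, threshold=0.75):
    """Return columns whose |Pearson r| with an earlier column exceeds threshold."""
    corr = frame.corr().abs()
    # Keep only the strict upper triangle so each pair is checked once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    return {col for col in upper.columns if (upper[col] > threshold).any()}


# Toy numeric features (illustrative values only).
toy = pd.DataFrame({
    "smv": [3.9, 4.1, 22.9, 30.1, 29.1, 2.9],
    "no_of_workers": [8, 10, 56, 59, 58, 8],
    "incentive": [0, 50, 0, 63, 0, 38],
})
to_drop = correlated_features(toy)
```

Of each correlated pair, the later column in the frame is the one flagged, so column order decides which feature survives.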

# dropping from train and test data
X_train_dropped_ = X_train.drop('no_of_workers',axis=1)
X_test_dropped_ = X_test.drop('no_of_workers',axis=1)
# SMOTENC'ed, StandardScaled, correlated feature dropped and OHE'ed data
X_train_log_reg, X_test_log_reg = fun.dataset_preprocessing_pipeline(
    X_train_dropped_, X_test_dropped_, drop='first')

logistic regression classifier

# logistic regression classifier
logreg = LogisticRegression(C=1e5, max_iter=1000, class_weight='balanced')
# score of logistic regression classifier

Report of LogisticRegression type model using train-test split dataset.

Train accuracy score: 0.7194
Test accuracy score: 0.7367
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on train data of:
        LogisticRegression(C=100000.0, class_weight='balanced', max_iter=1000)
              precision    recall  f1-score   support

           0       0.71      0.74      0.72       875
           1       0.73      0.70      0.71       875

    accuracy                           0.72      1750
   macro avg       0.72      0.72      0.72      1750
weighted avg       0.72      0.72      0.72      1750



Classification report on test data of:
    LogisticRegression(C=100000.0, class_weight='balanced', max_iter=1000)
              precision    recall  f1-score   support

           0       0.52      0.79      0.63        84
           1       0.90      0.72      0.80       216

    accuracy                           0.74       300
   macro avg       0.71      0.75      0.71       300
weighted avg       0.79      0.74      0.75       300



Overall, average performance. It can detect the majority of true negatives and positives, with good recall and f1. The ROC curve also looks good.


By looking at the coefficients of the model, I can get an idea of feature importance and the impact of each feature on the prediction. department_sweing and wip have the highest coefficients. Some teams are underperforming, but teams of both departments share the same numbers as identifiers.

grid search with Cross Validation

Grid search parameters:

logreg_gs = LogisticRegression(max_iter=1e4,
params = {
    'C': [.1, 1, 10, 100, 10000, 1e6, 1e12],
    'tol': [0.0001, 0.001, 0.01, .1],
    'penalty': ['l1', 'l2', 'elasticnet', None],
    'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']
gridsearch_logreg = GridSearchCV(estimator=logreg_gs,


Best Parameters by gridsearch:	{'C': 0.1, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.0001}
Best Estimator by gridsearch:	LogisticRegression(C=0.1, class_weight='balanced', max_iter=10000.0, n_jobs=-1,
                   penalty='l1', solver='saga')
logreg_gs_best = gridsearch_logreg.best_estimator_
fun.model_report(logreg_gs_best, X_train_log_reg, y_train, X_test_log_reg,

Report of LogisticRegression type model using train-test split dataset.

Train accuracy score: 0.728
Test accuracy score: 0.7467
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on test data of:
    LogisticRegression(C=0.1, class_weight='balanced', max_iter=10000.0, n_jobs=-1,
                   penalty='l1', solver='saga')
              precision    recall  f1-score   support

           0       0.53      0.80      0.64        84
           1       0.90      0.73      0.81       216

    accuracy                           0.75       300
   macro avg       0.72      0.76      0.72       300
weighted avg       0.80      0.75      0.76       300



Very minimal improvement overall.
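Not every solver in the grid supports every penalty, so part of a full cross-product grid cannot be fitted. One way to search only valid combinations is to pass a list of smaller grids, as in the sketch below. The synthetic data, the reduced `C` values, `scoring="precision"`, and `cv=5` are assumptions, not the settings used for the results above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the preprocessed training data.
X_toy, y_toy = make_classification(n_samples=200, n_features=6, random_state=0)

# Each dict pairs solvers with penalties they support.
param_grid = [
    {"solver": ["lbfgs", "newton-cg", "sag"], "penalty": ["l2"], "C": [0.1, 1, 10]},
    {"solver": ["liblinear", "saga"], "penalty": ["l1", "l2"], "C": [0.1, 1, 10]},
]

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=10_000, class_weight="balanced"),
    param_grid=param_grid,
    scoring="precision",  # assumption: matches the stated focus on precision
    cv=5,
)
search.fit(X_toy, y_toy)
best_params = search.best_params_
```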

At this point I could tackle outliers by removing them based on Z-score, IQR, or another method, and consider other scaling options. But the chance of data loss is high, and the process necessarily distorts the distribution of the data.
Moving on to the next type of model.

KNN classification

X_train_knn, X_test_knn = fun.dataset_preprocessing_pipeline(X_train, X_test)

knn = KNeighborsClassifier()

Report of KNeighborsClassifier type model using train-test split dataset.

Train accuracy score: 0.8766
Test accuracy score: 0.8633
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on train data of:
              precision    recall  f1-score   support

           0       0.86      0.90      0.88       875
           1       0.90      0.85      0.87       875

    accuracy                           0.88      1750
   macro avg       0.88      0.88      0.88      1750
weighted avg       0.88      0.88      0.88      1750



Classification report on test data of:
              precision    recall  f1-score   support

           0       0.72      0.85      0.78        84
           1       0.94      0.87      0.90       216

    accuracy                           0.86       300
   macro avg       0.83      0.86      0.84       300
weighted avg       0.87      0.86      0.87       300



Much better performance than the previous model. True negatives and positives are better, and all the metrics look good. The ROC curve is improved. But this can be made better with some hyperparameter tuning.

grid search with Cross Validation

These are the grid search parameters.

knn_gs = KNeighborsClassifier(n_jobs=-1)
params = {
    'n_neighbors': list(range(1, 31, 2)),
    'weights': ['uniform', 'distance'],
    'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'],
    'p': [1, 2, 2.5, 3, 4],
    'leaf_size': [30, 40]
gridsearch_knn = GridSearchCV(estimator=knn_gs,


Best Parameters by gridsearch:	{'algorithm': 'auto', 'leaf_size': 30, 'n_neighbors': 17, 'p': 1, 'weights': 'distance'}
Best Estimator by gridsearch:	KNeighborsClassifier(n_jobs=-1, n_neighbors=17, p=1, weights='distance')

Report of KNeighborsClassifier type model using train-test split dataset.

Train accuracy score: 0.9994
Test accuracy score: 1.0
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on test data of:
    KNeighborsClassifier(n_jobs=-1, n_neighbors=17, p=1, weights='distance')
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        84
           1       1.00      1.00      1.00       216

    accuracy                           1.00       300
   macro avg       1.00      1.00      1.00       300
weighted avg       1.00      1.00      1.00       300



Perfect result on the test data: all the scores are perfect across all the metrics, and the ROC curve is perfect.
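A perfect test score from a distance-weighted KNN is worth checking against the SMOTENC step, where X_train was built by oversampling the full X and y, so the test rows are also present in the training data. With weights='distance', a query point that coincides with a training point is decided by that point alone. The sketch below uses synthetic data (not the garment data; all names are illustrative) to show how much this overlap alone can lift the score.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Noisy synthetic data, so a perfect score is not expected on unseen rows.
X_toy, y_toy = make_classification(
    n_samples=300, n_features=6, flip_y=0.2, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.25, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=17, p=1, weights="distance")

# Training data that includes the test rows: each test point finds itself.
leaky_score = knn.fit(X_toy, y_toy).score(X_te, y_te)

# Training data that excludes the test rows.
clean_score = knn.fit(X_tr, y_tr).score(X_te, y_te)
```

If the two scores differ in the same way on the real data, fitting SMOTENC on the training split only would give a more trustworthy estimate.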

{'algorithm': 'auto',
 'leaf_size': 30,
 'metric': 'minkowski',
 'metric_params': None,
 'n_jobs': -1,
 'n_neighbors': 17,
 'p': 1,
 'weights': 'distance'}

These are the best parameters.

ensemble methods

X_train_ensbl, X_test_ensbl = fun.dataset_preprocessing_pipeline(
    X_train, X_test)

Random Forest™

Random Forest is a trademark of Leo Breiman and Adele Cutler and is licensed exclusively to "Salford Systems", a subsidiary of "Minitab, LLC", for the commercial release of the software. Random Forest is also known as random decision forests. It is one of the most extensively used black-box models. KNN and RF can both be classified as weighted neighborhood schemes. I am using scikit-learn's implementation of the concept.

RF generally requires less tuning for acceptable performance. Thus I am using random decision forest here, as I got a good result using KNN after some hyperparameter tuning via grid search with cross validation.

rf_clf = RandomForestClassifier()
fun.model_report(rf_clf, X_train_ensbl, y_train, X_test_ensbl,

Report of RandomForestClassifier type model using train-test split dataset.

Train accuracy score: 0.9994
Test accuracy score: 0.9967
    No over or underfitting detected, diffrence of scores did not cross 5% thresh hold.

Classification report on test data of:
              precision    recall  f1-score   support

           0       0.99      1.00      0.99        79
           1       1.00      1.00      1.00       221

    accuracy                           1.00       300
   macro avg       0.99      1.00      1.00       300
weighted avg       1.00      1.00      1.00       300



As expected, I get the same level of performance from the out-of-the-box model without any tuning. Let's look at the first few nodes of the 10th tree; I chose the 10th at random. This output is not friendly to view in a notebook. A copy can be found as a PDF file at './saved_model/rf_clf_sample_4.pdf' inside this repository.

Parameter used for the model:
{'bootstrap': True, 'ccp_alpha': 0.0, 'class_weight': None, 'criterion': 'gini', 'max_depth': None, 'max_features': 'auto', 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}


The most important features are, in descending order, incentive, smv, over_time, no_of_workers, targeted_productivity, and so on. This means those features were often used for building the trees in the forest, and they also make sense in real life.
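A ranking like this can be read from a fitted forest's `feature_importances_`. The sketch below uses synthetic data with borrowed column names, so the resulting order is illustrative and says nothing about the real features.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; the column names are borrowed for readability only.
names = ["incentive", "smv", "over_time", "no_of_workers", "targeted_productivity"]
X_arr, y_toy = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=0
)
X_toy = pd.DataFrame(X_arr, columns=names)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_toy, y_toy)

# Impurity-based importances, sorted from most to least important.
importances = pd.Series(rf.feature_importances_, index=names).sort_values(
    ascending=False
)
```

These importances are impurity-based and normalized to sum to 1, so they show relative use in the splits rather than the direction of an effect.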

Selecting Best model

The Random Forest model is the best one; it can achieve near-perfect prediction with minimal effort.



This plot shows the contribution of each feature to the model's prediction. This graph is for detecting the goal-met class.

  • Few explanations:

Features Probability of goal met Probability of goal not met
department_finishing Lower Value Higher Value
department_sweing High Value Lower Value
idle_men Lower Value
idle_time Lower Value
incentive High Value Lower Value
no_of_workers Above Average Below Average And High
over_time Average Below Average And High
smv Above Average Below Average And High
targeted_productivity Lower Value Higher Value
wip High Value Lower Value


This model can be used with confidence for predicting employee performance. It can detect both True negatives and positives with high precision.

  • Few insights where to focus
    • incentive is a very important decider for performance.
    • tune no_of_workers to an optimal level for better performance.
    • manpower assignment in teams, more specifically for finishing, should be re-evaluated as they are underperforming.
    • based on observation of smv, employees either need training or the allocation of time for tasks (fast and long jobs) should be reconsidered.


  • do a multi-class prediction by further binning of target.
  • fit a model with entire data and prepare for production use.
  • fine-tune functions for general use and add options for users.
  • mend appendix contents

