Getting "was not completed in time" error when preprocessing dataset #115

@AftabHussain

Hi, I am getting a preprocessing error when invoking source preprocess.sh. I don't get any error when I preprocess the same dataset with code2vec. I'd appreciate any advice. Here's the error:

Extracting paths from training set...
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
Finished extracting paths from training set
Creating histograms from the training data
subtoken vocab size:  0
node vocab size:  0
target vocab size:  0
File: <dataset_name>.raw.txt
Traceback (most recent call last):
  File "preprocess.py", line 115, in <module>
    max_contexts=int(args.max_contexts), max_data_contexts=int(args.max_data_contexts))
  File "preprocess.py", line 53, in process_file
    print('Average total contexts: ' + str(float(sum_total) / total))
ZeroDivisionError: float division by zero
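
The zero vocabulary sizes in the log suggest that no examples were extracted, so total is 0 by the time the average is printed and the division fails. A minimal sketch of a guard, assuming sum_total and total are the counters named in the traceback; the helper format_average_contexts is hypothetical and not part of preprocess.py:

```python
def format_average_contexts(sum_total, total):
    """Return the summary line, or a diagnostic when no examples were read.

    `sum_total` and `total` mirror the names in the traceback; this helper
    itself is hypothetical and is not part of preprocess.py.
    """
    if total == 0:
        # Nothing was extracted, so an average is undefined.
        return 'No examples found: the extracted paths file appears to be empty'
    return 'Average total contexts: ' + str(float(sum_total) / total)


print(format_average_contexts(0, 0))
print(format_average_contexts(30, 4))
```

A guard like this only makes the failure easier to read; the underlying problem would still be that the extraction step produced no output.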

This is the line that is being triggered:

print('dir: ' + str(dir) + ' was not completed in time', file=sys.stderr)
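
To find out why the extraction step did not finish, it may help to run the extractor command by hand and look at its exit code and stderr. A minimal sketch, assuming the extractor is launched as a subprocess; run_with_diagnostics and the stand-in command are hypothetical and would need to be replaced with the real extractor invocation:

```python
import subprocess
import sys


def run_with_diagnostics(command, timeout_seconds):
    """Run `command` and report whether it finished, failed, or timed out.

    Hypothetical debugging helper; not part of the project's scripts.
    Returns a (status, detail) tuple where status is 'ok', 'failed' or 'timeout'.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return 'timeout', 'no result after %s seconds' % timeout_seconds
    if result.returncode != 0:
        # A non-zero exit is a failure, not a timeout; stderr usually says why.
        return 'failed', result.stderr.decode(errors='replace').strip()
    return 'ok', ''


if __name__ == '__main__':
    # Stand-in command that fails immediately, as a crashing extractor would.
    failing = [sys.executable, '-c', 'import sys; sys.exit("boom")']
    print(run_with_diagnostics(failing, timeout_seconds=30))
```

If the status comes back as 'failed' rather than 'timeout', the captured stderr text is likely to be more informative than the "was not completed in time" message.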

Appreciate any thoughts.
