DiffParser

Parser

diffparser.parser.parse_diff(diff, path_included)

Takes a diff as a string, parses it, and returns a dictionary in the following format: {'oldCode': [], 'newCode': []}

If the diff includes a path naming the file for each change, the path_included parameter should be True; otherwise it should be False.

If the path is included, the returned dictionary contains one nested dictionary for each filename, in the following format: {'filename': {'oldCode': [], 'newCode': []}}

Parameters
  • diff (str) – String containing the diff to be parsed.

  • path_included (bool) – True if the diff contains the path to each file, otherwise False.

Returns

A dict containing the separated diff in the format described above.

Return type

separated_diff dict
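The return format described above can be illustrated with a minimal sketch. The split_diff function below is a hypothetical stand-in written for this example, not the implementation of parse_diff. It assumes a unified diff, that removed lines start with "-" and added lines with "+", and that unchanged context lines are ignored; the real function may treat these cases differently.

```python
def split_diff(diff: str) -> dict:
    """Hypothetical stand-in that mimics the documented return shape."""
    separated = {"oldCode": [], "newCode": []}
    for line in diff.splitlines():
        # Simplification: file headers and hunk markers are skipped,
        # so a removed line that itself begins with "--" would be lost.
        if line.startswith(("---", "+++", "@@")):
            continue
        if line.startswith("-"):
            separated["oldCode"].append(line[1:])
        elif line.startswith("+"):
            separated["newCode"].append(line[1:])
    return separated


example = "@@ -1,2 +1,2 @@\n-x = 1\n+x = 2\n print(x)\n"
result = split_diff(example)

# Shape described for path_included=True; the filename is made up.
per_file_shape = {"src/app.py": {"oldCode": ["x = 1"], "newCode": ["x = 2"]}}
```

With path_included set to False, the documented result is the flat dictionary; with True, the same pair of lists is nested under each filename.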

diffparser.parser.parse_list_of_commits(commits, path_included, diff_key='diff', changed_key='changedFiles')

Takes a list of commits and returns a parsed version of the same list. Here, a commit is a dictionary that contains, among other things, a string with the diff.

Parameters
  • commits (list) – List of dicts, where each dict represents a commit, which in turn contains a diff.

  • path_included (bool) – True if the diff contains the path to each file, otherwise False.

  • diff_key (str) – String describing the entry of the commit that contains the diff. Default: 'diff'

  • changed_key (str) – String describing the entry of the commit that contains the changed files. Default: 'changedFiles'

Returns

The same list as the input parameter, but with entries for oldCode and newCode.

Return type

commits list
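As a rough illustration of the input and output shapes, the sketch below uses a hypothetical add_code_entries helper in place of parse_list_of_commits. It assumes that oldCode and newCode are stored as top-level entries on each commit dict and that the list is modified in place; the "sha" key is made up for the example. None of this is confirmed by the description above.

```python
def add_code_entries(commits, diff_key="diff"):
    """Hypothetical stand-in: attach oldCode/newCode to every commit."""
    for commit in commits:
        old, new = [], []
        for line in commit[diff_key].splitlines():
            # Skip the "---" / "+++" file headers of a unified diff.
            if line.startswith("-") and not line.startswith("---"):
                old.append(line[1:])
            elif line.startswith("+") and not line.startswith("+++"):
                new.append(line[1:])
        commit["oldCode"], commit["newCode"] = old, new
    return commits


commits = [{"sha": "abc123", "diff": "-a = 1\n+a = 2\n"}]
parsed = add_code_entries(commits)
```

If the commits store the diff under a different entry, the diff_key argument names that entry instead of the default.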

diffparser.parser.parsed_to_txt(parsed_lst, output_path, changed_key=None)

Takes a list of parsed commits and writes the old code to the src file, and the new code to the tgt file.

Parameters
  • parsed_lst (list) – List of parsed commits.

  • output_path (str) – Path to the directory to which the files are written.

Return type

None
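The src/tgt split can be sketched as follows. The write_pairs function is a hypothetical stand-in for parsed_to_txt, and several details are assumptions made only for this example: the file names src.txt and tgt.txt, one output line per commit, and code lines joined with a single space.

```python
import tempfile
from pathlib import Path


def write_pairs(parsed_lst, output_path):
    """Hypothetical stand-in: old code goes to src, new code to tgt."""
    out = Path(output_path)
    out.mkdir(parents=True, exist_ok=True)
    # File names are assumed; the source only says "src file" and "tgt file".
    with open(out / "src.txt", "w", encoding="utf-8") as src, \
         open(out / "tgt.txt", "w", encoding="utf-8") as tgt:
        for commit in parsed_lst:
            src.write(" ".join(commit["oldCode"]) + "\n")
            tgt.write(" ".join(commit["newCode"]) + "\n")


parsed = [{"oldCode": ["a = 1"], "newCode": ["a = 2"]}]
out_dir = tempfile.mkdtemp()
write_pairs(parsed, out_dir)
src_text = (Path(out_dir) / "src.txt").read_text(encoding="utf-8")
tgt_text = (Path(out_dir) / "tgt.txt").read_text(encoding="utf-8")
```

Keeping line i of the src file aligned with line i of the tgt file is one common layout for paired before/after data, which is why the sketch writes exactly one line per commit to each file.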

diffparser.parser.read_json(file_name)

Reads a JSON file.

Parameters

file_name (str) – String representing the path/filename of the data file.

Returns

The JSON file is expected to contain a list of dictionaries, so the returned data is a list of dicts.

Return type

data list

diffparser.parser.read_jsonl(file_name)

Reads a JSONL file.

Parameters

file_name (str) – String representing the path/filename of the data file.

Returns

The JSONL file is expected to contain one dictionary per line, so the returned data is a list of dicts.

Return type

data list
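The difference between the two input formats can be shown with the standard library alone. The load_json and load_jsonl functions below are hypothetical equivalents written for this example, not the code behind read_json and read_jsonl; they assume UTF-8 files and, for JSONL, that blank lines can be skipped.

```python
import json
import os
import tempfile


def load_json(file_name):
    """Hypothetical equivalent: the whole file is one JSON document."""
    with open(file_name, encoding="utf-8") as f:
        return json.load(f)


def load_jsonl(file_name):
    """Hypothetical equivalent: each non-empty line is one JSON document."""
    with open(file_name, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write the same records in both formats, then read them back.
records = [{"diff": "-a\n+b\n"}, {"diff": "-c\n+d\n"}]
tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "commits.json")
jsonl_path = os.path.join(tmp_dir, "commits.jsonl")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(records, f)
with open(jsonl_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

from_json = load_json(json_path)
from_jsonl = load_jsonl(jsonl_path)
```

Both calls yield the same list of dicts; only the on-disk layout differs.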