data_juicer.analysis.overall_analysis module
- class data_juicer.analysis.overall_analysis.OverallAnalysis(dataset, output_path)
Bases:
object
Analyzes the overall stats of a dataset, including mean, standard deviation, quantiles, etc.
- __init__(dataset, output_path)
Initialization method.
- Parameters:
dataset -- the dataset to be analyzed
output_path -- path to store the analysis results.
- analyze(percentiles=[], num_proc=1, skip_export=False)
Applies overall analysis to the whole dataset, based on the describe method of pandas.
- Parameters:
percentiles -- percentiles to analyze
num_proc -- number of processes to analyze the dataset
skip_export -- whether to skip exporting the results to disk
- Returns:
the overall analysis result.
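
Since analyze is described as being based on the describe method of pandas, the following minimal sketch shows what that pandas call reports for a small table of numeric stats. It does not call OverallAnalysis itself. The column names (text_len, word_count) and the values are hypothetical, and passing percentiles as fractions between 0 and 1 is the pandas convention, which is assumed rather than confirmed to be what analyze expects.

```python
import pandas as pd

# Hypothetical per-sample stats; column names and values are made up
# for illustration and do not come from data_juicer.
stats = pd.DataFrame({
    "text_len": [10, 20, 30, 40, 50],
    "word_count": [2, 4, 6, 8, 10],
})

# pandas' describe() reports count, mean, std, min, max and the
# requested percentiles (given as fractions between 0 and 1).
summary = stats.describe(percentiles=[0.25, 0.5, 0.75])
print(summary)
```

Each column of the resulting table summarizes one stat, and each row holds one aggregate (count, mean, std, min, the requested percentiles, max), which matches the kind of overall result described above.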