data_juicer.analysis.column_wise_analysis module
- data_juicer.analysis.column_wise_analysis.get_row_col(total_num, factor=2)
Given the total number of stats figures, compute the "best" number of rows and columns for laying them out. This is needed when all stats figures are stored in a single image.
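To make the layout idea concrete, here is a minimal sketch of one way such a grid could be chosen. It is not the library's implementation: the name `sketch_row_col`, the `max_cols` cap, the near-square heuristic, and the shape of the returned grid list (one `(row, col)` anchor per stat) are all assumptions made for illustration.

```python
import math


def sketch_row_col(total_num, factor=2, max_cols=8):
    """Hypothetical layout helper; NOT data_juicer's get_row_col."""
    if total_num <= 0 or factor <= 0:
        return 0, 0, []
    # Aim for a roughly square grid of sub-figures (assumed heuristic).
    target_cols = math.ceil(math.sqrt(total_num * factor))
    # Each stat occupies `factor` adjacent columns, so round up to a
    # whole number of stats per row and cap the width at max_cols.
    stats_per_row = math.ceil(target_cols / factor)
    stats_per_row = max(1, min(stats_per_row, max_cols // factor))
    cols = stats_per_row * factor
    rows = math.ceil(total_num / stats_per_row)
    # Grid list: (row, col) of the first sub-figure of each stat.
    grid = []
    for i in range(total_num):
        r, c = divmod(i, stats_per_row)
        grid.append((r, c * factor))
    return rows, cols, grid


rows, cols, grid = sketch_row_col(4)
print(rows, cols, grid)
```

With four stats and two sub-figures per stat, this sketch places two stats on each row, so the histogram and box plot of every stat sit side by side.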
- Parameters:
total_num -- Total number of stats figures
factor -- Number of sub-figure types in each figure. By default it is 2, meaning each stat figure contains a histogram and a box plot
- Returns:
"best" number of rows and columns, and the grid list
- class data_juicer.analysis.column_wise_analysis.ColumnWiseAnalysis(dataset, output_path, overall_result=None, save_stats_in_one_file=True)
Bases:
object
Apply analysis on each column of stats respectively.
- __init__(dataset, output_path, overall_result=None, save_stats_in_one_file=True)
Initialization method
- Parameters:
dataset -- the dataset to be analyzed
output_path -- path to store the analysis results
overall_result -- optional precomputed overall stats result
save_stats_in_one_file -- whether to save the analysis figures of all stats into one image file
- analyze(show_percentiles=False, show=False, skip_export=False)
Apply analysis and draw the analysis figure for stats.
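The `show_percentiles` option marks quantiles of each stats distribution with lines in the sub-figures. As a rough illustration of the values such lines would sit at, the sketch below computes quartiles with the standard library. Which percentiles the library actually draws is not stated here, so the choice of 25%, 50% and 75%, the helper name `percentile_markers`, and the sample data are assumptions.

```python
import statistics


def percentile_markers(values):
    """Hypothetical helper: quartile positions for marker lines."""
    # method="inclusive" treats the data as the full population,
    # so the markers always fall within the observed range.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"25%": q1, "50%": median, "75%": q3}


# Made-up stat values, e.g. a per-sample statistic.
markers = percentile_markers([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(markers)
```

Each value in the returned mapping is an x-position (for a histogram) or a y-position (for a vertical box plot) at which a marker line could be drawn.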
- Parameters:
show_percentiles -- whether to show percentile lines in each sub-figure. If true, several red lines mark the quantiles of the stats distribution
show -- whether to show the figure in a single window after drawing
skip_export -- whether to skip saving the results to disk
- Returns:
- draw_hist(ax, data, save_path, percentiles=None, show=False)
Draw the histogram for the data.
- Parameters:
ax -- the axes to draw on
data -- data to draw
save_path -- the path to save the histogram figure
percentiles -- the overall analysis result of the data, including percentile information
show -- whether to show the figure in a single window after drawing
- Returns:
- draw_box(ax, data, save_path, percentiles=None, show=False)
Draw the box plot for the data.
- Parameters:
ax -- the axes to draw on
data -- data to draw
save_path -- the path to save the box plot figure
percentiles -- the overall analysis result of the data, including percentile information
show -- whether to show the figure in a single window after drawing
- Returns: