:recycle: [Refactor] Replace output_path with html_path to avoid confusion 8c0b736 Hansimov committed on Jan 11
:boom: [Fix] WebpageFetcher: raise timeout when request.get hangs bce51d4 Hansimov committed on Jan 11
:zap: [Enhance] BatchWebpageFetcher: return url_and_output_path_list 4591d96 Hansimov committed on Jan 11
:gem: [Feature] New BatchWebpageFetcher: Fetch multiple urls concurrently e92817a Hansimov committed on Jan 11
:zap: [Enhance] Rename HTMLFetcher to WebpageFetcher, and add output_parent param 62ee9e4 Hansimov committed on Jan 10
:recycle: [Refactor] WebpageContentExtractor: Separate html and markdown processing a636bcb Hansimov committed on Jan 10
:zap: [Enhance] HTMLFetcher and GoogleSearcher: support cache with overwrite, and ignore host cf4c3f8 Hansimov committed on Jan 10
:recycle: [Refactor] HTMLFetcher: replace save_path with output_path 7d44e75 Hansimov committed on Jan 10
:zap: [Enhance] GoogleSearcher: Add params of result_sum and safe 8bf48d8 Hansimov committed on Jan 10
:gem: [Feature] New FilepathConverter: convert urls and queries to valid file path 64a0dbf Hansimov committed on Jan 7
:recycle: [Refactor] Move header constructor, and prettier logging e448a74 Hansimov committed on Jan 6
:gem: [Feature] New GoogleSearcher: Enable google search with query 6cf0820 Hansimov committed on Jan 6