Commit History

:zap: [Enhance] WebpageContentExtractor: Escape dash, and ignore
c7c538d

Hansimov committed on

:zap: [Enhance] ignore classes pattern, especially for 163.com
3dda344

Hansimov committed on

:zap: [Enhance] Rename HTMLFetcher to WebpageFetcher, and add output_parent param
62ee9e4

Hansimov committed on

:zap: [Enhance] SearchAPIApp: overwrite param for query and webpage HTML
9fb4731

Hansimov committed on

:recycle: [Refactor] WebpageContentExtractor: Separate html and markdown processing
a636bcb

Hansimov committed on

:recycle: [Refactor] Move hardcoded consts to network_configs
af2c647

Hansimov committed on

:zap: [Enhance] HTMLFetcher and GoogleSearcher: support cache with overwrite, and ignore host
cf4c3f8

Hansimov committed on

:gem: [Feature] SearchAPIApp: New extract_content param
4d3e890

Hansimov committed on

:gem: [Feature] New WebpageContentExtractor: extract webpage content as clean markdown
e773696

Hansimov committed on

:recycle: [Refactor] HTMLFetcher: replace save_path with output_path
7d44e75

Hansimov committed on

:gem: [Feature] Enable SearchAPIApp: /queries_to_search_results
138c09e

Hansimov committed on

:zap: [Enhance] GoogleSearcher: Add params of result_sum and safe
8bf48d8

Hansimov committed on

:recycle: [Refactor] Rename SearchResultsExtractor to QueryResultsExtractor, and store results
0f6452f

Hansimov committed on

:zap: [Enhance] FilepathConverter: New parent param when init
f9c42cf

Hansimov committed on

:gem: [Feature] New HTMLFetcher: download url to local html file
b259fec

Hansimov committed on

:gem: [Feature] New FilepathConverter: convert urls and queries to valid file path
64a0dbf

Hansimov committed on

:recycle: [Refactor] Move header constructor, and prettier logging
e448a74

Hansimov committed on

:gem: [Feature] SearchResultsExtractor: related questions
f150f6b

Hansimov committed on

:gem: [Feature] New SearchResultsExtractor: title, site, link, abstract
ef3de03

Hansimov committed on

:pencil: [Doc] Readme and git ignore
d6015f4

Hansimov committed on

:gem: [Feature] New Enver and Logger
f2ec1a1

Hansimov committed on

:gem: [Feature] New GoogleSearcher: Enable google search with query
6cf0820

Hansimov committed on