URLAnalyzer

URLAnalyzer(url: str)
A class that analyzes URLs to determine if they point to web pages or files.
Initialize the URLAnalyzer with a URL.
Parameters:
Name  Description
url   Type: str

Class Attributes

FormatToMimeType



MimeTypeToFormat



format_type



mime_type



mime_types



Static Methods

get_supported_extensions

get_supported_extensions() -> list[str]
Return a list of supported file extensions.

get_supported_formats

get_supported_formats() -> list[InputFormat]
Return a list of supported file formats.

get_supported_mime_types

get_supported_mime_types() -> list[str]
Return a list of all supported MIME types.

Instance Methods

analyze

analyze(
    self,
    test_url: bool = False,
    follow_redirects: bool = True,
    prioritize_extension: bool = True
) -> dict[str, Any]
Analyze the URL to determine if it points to a web page or a file.
Parameters:
Name                  Description
test_url              Whether to test the URL by making a request. Type: bool. Default: False
follow_redirects      Whether to follow redirects when testing the URL. Type: bool. Default: True
prioritize_extension  Whether to prioritize file extension over MIME type. Type: bool. Default: True
Returns:
Type                   Description
dict[str, typing.Any]  A dictionary containing the analysis results
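
To make the prioritize_extension flag concrete, here is a minimal standard-library sketch. It is not URLAnalyzer's implementation: the function classify_url, its header_mime argument, and the keys of the returned dictionary are hypothetical names chosen for illustration. It assumes that prioritizing the extension means the MIME type guessed from the URL path wins over a server-reported Content-Type when both are available, and that HTML counts as a web page.

```python
import mimetypes
from typing import Any, Optional
from urllib.parse import urlparse


def classify_url(
    url: str,
    header_mime: Optional[str] = None,
    prioritize_extension: bool = True,
) -> dict[str, Any]:
    """Hypothetical stand-in: decide whether a URL looks like a file or a web page."""
    path = urlparse(url).path
    # Guess a MIME type from the path's extension (None if unknown or absent).
    ext_mime, _ = mimetypes.guess_type(path)
    # Drop parameters such as "; charset=utf-8" from a Content-Type value.
    hdr_mime = header_mime.split(";")[0].strip().lower() if header_mime else None

    if prioritize_extension:
        mime = ext_mime or hdr_mime
    else:
        mime = hdr_mime or ext_mime

    # Assumption: HTML, or no detectable type at all, is treated as a web page.
    is_file = mime is not None and mime not in ("text/html", "application/xhtml+xml")
    return {"url": url, "mime_type": mime, "is_file": is_file}


# The same URL classified both ways when extension and header disagree.
by_extension = classify_url(
    "https://example.com/files/report.pdf",
    header_mime="text/html; charset=utf-8",
)
by_header = classify_url(
    "https://example.com/files/report.pdf",
    header_mime="text/html; charset=utf-8",
    prioritize_extension=False,
)
```

In this sketch the two calls disagree: the extension-first call reports a PDF file, while the header-first call reports a web page. That is the kind of conflict the flag is meant to resolve.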

follow_redirects

follow_redirects(self) -> Tuple[str, list[str]]
Follow redirects for the URL without analyzing content types.
Returns:
Type                   Description
Tuple[str, list[str]]  The final URL and the redirect chain
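
The shape of the return value (final URL plus redirect chain) can be illustrated with a small offline sketch. Everything here is an assumption made for illustration, not URLAnalyzer's code: follow_redirect_chain and the get_location callable are hypothetical, and whether the real chain includes the starting URL is not stated in this reference. The callable stands in for an HTTP request and returns the Location target of a URL, or None when there is no redirect.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin


def follow_redirect_chain(
    url: str,
    get_location: Callable[[str], Optional[str]],
    max_hops: int = 10,
) -> Tuple[str, list[str]]:
    """Hypothetical helper: resolve redirects and record every URL visited."""
    chain = [url]  # assumption: the chain starts with the original URL
    current = url
    for _ in range(max_hops):
        location = get_location(current)
        if location is None:
            break  # no further redirect
        # Location values may be relative, so resolve against the current URL.
        next_url = urljoin(current, location)
        if next_url in chain:
            break  # stop on a redirect loop
        chain.append(next_url)
        current = next_url
    return current, chain


# Simulated redirects: a dict lookup replaces real network requests.
redirects = {
    "http://example.com/a": "/b",
    "http://example.com/b": "https://example.org/final.pdf",
}
final_url, chain = follow_redirect_chain("http://example.com/a", redirects.get)
```

Passing the lookup as a callable keeps the sketch testable without network access. A real implementation would issue requests and read the Location header of each 3xx response.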

get_redirect_info

get_redirect_info(self) -> dict[str, Any]
Get information about redirects that occurred during the last request.
Returns:
Type                   Description
dict[str, typing.Any]  Information about redirects

get_result

get_result(self) -> dict[str, Any] | None
Get the last analysis result, or None if the URL hasn’t been analyzed yet.
Returns:
Type                          Description
dict[str, typing.Any] | None  The analysis result, or None