Module backend.classifier.classifiers.seo_rule_based_import.libs.indicators

Functions

def calculate_loading_time(url)

Function to calculate the loading time of a URL.

Args

url : str
The URL to calculate the loading time for.

Returns

float
The loading time in seconds, or -1 if it cannot be calculated.
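
The return contract above (seconds on success, -1 on failure) can be illustrated with a minimal sketch. It is not the module's implementation: the name `sketch_loading_time`, the injectable `fetch` parameter, and the use of `urllib.request` are all assumptions made for illustration, and the real function may well measure a browser page load instead.

```python
import time
import urllib.request


def sketch_loading_time(url, fetch=None, timeout=10):
    """Hypothetical helper: seconds needed to fetch `url`, or -1 on failure."""
    if fetch is None:
        # Assumed default: a plain HTTP GET that reads the whole response body.
        def fetch(target):
            with urllib.request.urlopen(target, timeout=timeout) as response:
                response.read()

    start = time.perf_counter()
    try:
        fetch(url)
    except Exception:
        # Any error (DNS failure, timeout, HTTP error) maps to the -1 sentinel.
        return -1
    return time.perf_counter() - start
```

Passing a custom `fetch` callable keeps the sketch testable without network access; callers should check for -1 before using the result in arithmetic.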
def create_webdriver()

Function to extract hyperlinks from the source code of a webpage.

Args

source : str
The source code of the webpage.
main : str
The main URL of the website.

Returns

str
The extracted hyperlinks.
def get_netloc(url)

Function to get the netloc (network location, typically the domain and any port) of a URL.

Args

url : str
The URL to get the netloc from.

Returns

str
The netloc of the URL.
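
As an illustration of what a netloc looks like, the sketch below uses `urllib.parse.urlparse` from the standard library. The name `sketch_netloc` is hypothetical, and whether `get_netloc` itself relies on `urlparse` is an assumption.

```python
from urllib.parse import urlparse


def sketch_netloc(url):
    """Hypothetical helper: return the network-location part of `url`."""
    # For "https://www.example.com:8080/path" this is "www.example.com:8080".
    return urlparse(url).netloc
```

Note that `urlparse` returns an empty netloc for input without a scheme, such as `www.example.com/path`, so callers may want to validate the URL first.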
def get_plugins()

Function to get the list of plugins from the configuration file.

Returns

list
The list of plugins.
def get_scheme(url)

Function to get the scheme (http or https) of a URL.

Args

url : str
The URL to get the scheme from.

Returns

str
The scheme of the URL.
def get_sources()

Function to get the list of sources from the configuration file.

Returns

list
The list of sources.
def identify_canonical(source)

Function to identify the number of canonical links in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
The number of canonical links.
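
One possible way to count canonical links is sketched below with the standard-library `html.parser`. The names `_CanonicalCounter` and `sketch_count_canonical` are hypothetical, and the parsing approach is an assumption rather than the module's actual method.

```python
from html.parser import HTMLParser


class _CanonicalCounter(HTMLParser):
    """Counts <link> tags whose rel attribute contains "canonical"."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        rel = dict(attrs).get("rel") or ""
        # rel may hold several space-separated tokens.
        if "canonical" in rel.lower().split():
            self.count += 1


def sketch_count_canonical(source):
    """Hypothetical helper: number of canonical <link> tags in `source`."""
    counter = _CanonicalCounter()
    counter.feed(source)
    return counter.count
```

A count above 1 is worth flagging, since a page is normally expected to declare a single canonical URL.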
def identify_description(source)

Function to identify if a webpage has a meta description.

Args

source : str
The source code of the webpage.

Returns

int
1 if a meta description is present, 0 otherwise.
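
The 1/0 result can be illustrated with a small sketch based on `html.parser`. The names are hypothetical, and treating an empty `content` attribute as "no description" is an assumption that the real function may not share.

```python
from html.parser import HTMLParser


class _DescriptionFinder(HTMLParser):
    """Remembers whether a non-empty <meta name="description"> was seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").strip()
        # Assumption: an empty description does not count as present.
        if name == "description" and content:
            self.found = True


def sketch_has_description(source):
    """Hypothetical helper: 1 if a meta description is present, else 0."""
    finder = _DescriptionFinder()
    finder.feed(source)
    return 1 if finder.found else 0
```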
def identify_h1(source)

Function to identify the number of H1 tags in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
The number of H1 tags.
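
A rough way to count H1 tags is a case-insensitive regular expression over the raw source, as sketched below. The name `sketch_count_h1` is hypothetical, and the regex approach is an assumption; it does not skip tags inside comments or scripts.

```python
import re

# Matches the start of an opening <h1> tag, with or without attributes.
_H1_OPEN = re.compile(r"<h1\b", re.IGNORECASE)


def sketch_count_h1(source):
    """Hypothetical helper: number of opening <h1> tags in `source`."""
    return len(_H1_OPEN.findall(source))
```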
def identify_https(url)

Function to identify if a URL uses HTTPS.

Args

url : str
The URL to identify.

Returns

int
1 if the URL uses HTTPS, 0 otherwise.
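
A minimal sketch of this check, assuming the decision is based only on the URL scheme and not on a live connection or certificate test; the name `sketch_uses_https` is hypothetical.

```python
from urllib.parse import urlparse


def sketch_uses_https(url):
    """Hypothetical helper: 1 if the scheme of `url` is https, else 0."""
    return 1 if urlparse(url).scheme.lower() == "https" else 0
```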

Function to identify the number of internal and external hyperlinks in a webpage.

Args

hyperlinks : str
The extracted hyperlinks from the webpage.
main : str
The main URL of the website.

Returns

dict
The number of internal and external hyperlinks.
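
The internal/external split can be sketched as a host comparison. Several things here are assumptions: the sketch takes an iterable of link strings (the format of the `hyperlinks` string is not specified above), the dictionary keys `internal` and `external` are invented for illustration, and a link counts as internal only when its host equals the host of `main` exactly, so subdomains count as external.

```python
from urllib.parse import urljoin, urlparse


def sketch_count_links(hyperlinks, main):
    """Hypothetical helper: count links pointing at or away from `main`."""
    main_host = urlparse(main).netloc.lower()
    counts = {"internal": 0, "external": 0}
    for link in hyperlinks:
        # Resolve relative links such as "/about" against the main URL.
        host = urlparse(urljoin(main, link)).netloc.lower()
        if host == main_host:
            counts["internal"] += 1
        else:
            counts["external"] += 1
    return counts
```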
def identify_keyword_density(source, search_query)

Function to identify the keyword density in the source code of a webpage.

Args

source : str
The source code of the webpage.
search_query : str
The search query containing the keywords.

Returns

float
The keyword density.
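
The exact formula is not stated above. One common definition, used in the sketch below purely as an assumption, is the percentage of words in the text that match a query keyword. The name `sketch_keyword_density` is hypothetical, and the sketch expects visible text rather than raw HTML; stripping tags is not shown.

```python
import re


def sketch_keyword_density(text, search_query):
    """Hypothetical helper: percentage of words in `text` that are keywords."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    keywords = set(re.findall(r"\w+", search_query.lower()))
    hits = sum(1 for word in words if word in keywords)
    return 100.0 * hits / len(words)
```

For the text `SEO tips and more SEO tricks` and the query `seo tips`, three of the six words match, which gives a density of 50.0.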
def identify_keywords_in_source(source, search_query)

Function to identify the number of occurrences of keywords in the source code of a webpage.

Args

source : str
The source code of the webpage.
search_query : str
The search query containing the keywords.

Returns

int
The number of occurrences of keywords.
def identify_keywords_in_url(url, search_query)

Function to identify the number of occurrences of keywords in a URL.

Args

url : str
The URL to check.
search_query : str
The search query containing the keywords.

Returns

int
The number of occurrences of keywords.
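
A minimal sketch of counting keywords in a URL, assuming the query is split on whitespace and each keyword is matched as a case-insensitive substring; both choices and the name `sketch_keywords_in_url` are assumptions.

```python
def sketch_keywords_in_url(url, search_query):
    """Hypothetical helper: total substring hits of query keywords in `url`."""
    lowered = url.lower()
    return sum(lowered.count(keyword) for keyword in search_query.lower().split())
```

Substring matching means a short keyword can also match inside a longer word, which may overcount; a stricter variant would split the URL into tokens first.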
def identify_micros(source)

Function to identify microdata/microformats in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

list
The list of identified microdata/microformats.
def identify_nofollow(source)

Function to identify the number of nofollow links in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
The number of nofollow links.
def identify_og(source)

Function to identify if Open Graph tags are present in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
1 if Open Graph tags are present, 0 otherwise.
def identify_plugins(source)

Function to identify the plugins used in a webpage based on the source code.

Args

source : str
The source code of the webpage.

Returns

dict
The identified plugins.
def identify_robots_txt(main)

Function to identify if a website has a robots.txt file.

Args

main : str
The main URL of the website.

Returns

int
1 if a robots.txt file is present, 0 otherwise.
def identify_sitemap(source)

Function to identify if a sitemap link is present in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
1 if a sitemap link is present, 0 otherwise.
def identify_sources(main)

Function to identify the sources used in a webpage based on the main URL.

Args

main : str
The main URL of the website.

Returns

dict
The identified sources.
def identify_title(source)

Function to identify if a webpage has a title.

Args

source : str
The source code of the webpage.

Returns

int
1 if a title is present, 0 otherwise.
def identify_url_length(url)

Function to identify the length of a URL.

Args

url : str
The URL to identify the length of.

Returns

str
The length of the URL.
def identify_viewport(source)

Function to identify if a viewport meta tag is present in the source code of a webpage.

Args

source : str
The source code of the webpage.

Returns

int
1 if a viewport meta tag is present, 0 otherwise.
def identify_wordpress(source)

Function to identify if a webpage is built with WordPress based on the source code.

Args

source : str
The source code of the webpage.

Returns

int
1 if the webpage is built with WordPress, 0 otherwise.
def is_valid_url(url)

Function to check if a URL is valid.

Args

url : str
The URL to check.

Returns

bool
True if the URL is valid, False otherwise.
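
What counts as "valid" is not defined above. The sketch below assumes a URL is valid when it parses, uses the http or https scheme, and has a non-empty netloc; the name `sketch_is_valid_url` and these criteria are assumptions.

```python
from urllib.parse import urlparse


def sketch_is_valid_url(url):
    """Hypothetical helper: True for http(s) URLs that include a host."""
    try:
        parts = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```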
def match_text(text, pattern)

Function to check if a text matches a pattern using wildcard characters.

Args

text : str
The text to check.
pattern : str
The pattern to match against.

Returns

bool
True if the text matches the pattern, False otherwise.
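
The wildcard syntax is not specified above. As an illustration only, the sketch below assumes shell-style wildcards (`*` for any run of characters, `?` for a single character) and delegates to the standard-library `fnmatch.fnmatchcase`; the name `sketch_match_text` is hypothetical.

```python
from fnmatch import fnmatchcase


def sketch_match_text(text, pattern):
    """Hypothetical helper: case-sensitive shell-style wildcard match."""
    return fnmatchcase(text, pattern)
```

The pattern must cover the whole text, so matching a fragment anywhere in the text requires wrapping it in `*`, as in `*plugins/yoast*`.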
def read_config_file(filename)
def save_robot_txt(main)

Function to get the content of a domain's robots.txt file.

Args

main : str
The main URL of the website.

Returns

str
The content of the robots.txt file, or False if it cannot be retrieved.