Get "prefix" from the API response for Belo Horizonte-MG
The "prefix" argument is, most of the time, the date of the day before the gazette's date. In some cases, however, it is the same date as the gazette's. Instead of trying to work out the correct value, read it from the site's API response, where this information is already available.

Signed-off-by: Renne Rocha <[email protected]>
rennerocha authored Dec 16, 2023
1 parent 5b7c8c8 commit e38ff29
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions data_collection/gazette/spiders/mg/mg_belo_horizonte.py
@@ -1,8 +1,8 @@
 import datetime
-from datetime import timedelta
 from urllib.parse import urlencode

 import scrapy
+import w3lib.url
 from dateutil.rrule import DAILY, rrule

 from gazette.items import Gazette
@@ -34,10 +34,12 @@ def parse(self, response, gazette_date):
         gazettes = data["data"]
         for gazette in gazettes:
             is_extra_edition = gazette["tipo_edicao"] != "P"

-            prefix = (gazette_date - timedelta(days=1)).strftime("%Y%m%d")
             gazette_hash = gazette["documento_jornal"]["nome_minio"]
-            gazette_url = f"https://api-dom.pbh.gov.br/api/v1/documentos/{gazette_hash}/download?prefix={prefix}"
+            gazette_url = f"https://api-dom.pbh.gov.br/api/v1/documentos/{gazette_hash}/download"
+            prefix = gazette["prefix"]
+            if prefix is not None:
+                gazette_url = w3lib.url.add_or_replace_parameter(gazette_url, "prefix", prefix)

             yield Gazette(
                 date=gazette_date,
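
The URL-building step in this change can be tried outside the spider. The sketch below is an illustration, not the spider's code: `build_gazette_url` is a hypothetical helper, the sample payloads are made up (only the `prefix` and `documento_jornal.nome_minio` fields come from the diff), and the local `add_or_replace_parameter` is a standard-library stand-in that approximates `w3lib.url.add_or_replace_parameter` so the example runs without extra dependencies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://api-dom.pbh.gov.br/api/v1/documentos"


def add_or_replace_parameter(url: str, name: str, value: str) -> str:
    """Stdlib stand-in (assumption) for w3lib.url.add_or_replace_parameter."""
    parts = urlsplit(url)
    # Drop any existing occurrence of the parameter, then append the new value.
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def build_gazette_url(gazette: dict) -> str:
    """Hypothetical helper: add "prefix" only when the API response provides one."""
    gazette_hash = gazette["documento_jornal"]["nome_minio"]
    url = f"{BASE_URL}/{gazette_hash}/download"
    prefix = gazette["prefix"]
    if prefix is not None:
        url = add_or_replace_parameter(url, "prefix", prefix)
    return url


# Made-up payloads shaped like the fields the spider reads; values are illustrative.
with_prefix = {"prefix": "20231215", "documento_jornal": {"nome_minio": "abc123"}}
without_prefix = {"prefix": None, "documento_jornal": {"nome_minio": "abc123"}}

print(build_gazette_url(with_prefix))
print(build_gazette_url(without_prefix))
```

Taking the value from the response avoids guessing whether the prefix is the gazette's date or the previous day's, and a missing prefix simply yields the bare download URL.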