Orion APA Citation version history - 3 versions
Orion APA Citation by Dasgeek
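Be careful with old versions! These versions are displayed for testing and reference purposes. You should always use the latest version of an add-on.
Latest version
Version 2.1
Released May 17, 2026 - 625.44 KB
Works with Firefox 128.0 and later
PDF Metadata Overhaul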
Focus: Accurate scraping of government, academic, and institutional PDFs
What's New
Two-stage PDF metadata extraction
Previous versions attempted to read PDF binary headers (Info dictionary and XMP) and fell back to URL parsing when those were empty. Many government and academic publishers, including NIST, NIH, DOJ, and most journal publishers, deliberately leave binary headers sparse. All meaningful metadata (full title, authors, DOI, year) lives in the typeset cover page itself. v2.1 adds a second scraping stage that reads the rendered text from Firefox's PDF.js viewer DOM, which is where the real data is.
Cover page text parser (parsePdfPageText)
A dedicated parser now extracts from the rendered first-page text:
Title: scored by position, word count, and mixed-case detection to distinguish real headings from boilerplate like "JOINT TASK FORCE" or "This publication is available free of charge from:"
Year: context-aware matching tries explicit signals ("Published 2024", "March 2024") before falling back to any plausible 4-digit year, reducing false matches from document codes and version numbers
DOI: regex scan of cover page text catches printed DOI URLs (e.g. https://doi.org/10.6028/NIST.SP.800-53r5) that would never appear in a binary header; pre-fills the DOI field automatically
Author: checks for "Author:", "Prepared by:", and "By" labels; also recognizes common organizational body names (Joint Task Force, Computer Security Division, Information Technology Laboratory) for government documents without a named individual author
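The title and year heuristics described above can be sketched roughly as follows. This is an illustration only, not the extension's actual parsePdfPageText code: the function names (findYear, scoreTitleLine, pickTitle), the boilerplate patterns, and all score weights are assumptions chosen for the example.

```javascript
// Hypothetical helpers; the real parsePdfPageText may work differently.
const MONTHS =
  "January|February|March|April|May|June|July|August|September|October|November|December";

// Try explicit signals ("Published 2024", "March 2024") first, then fall
// back to any plausible bare 4-digit year.
function findYear(text, maxYear = new Date().getFullYear() + 1) {
  const explicit = [
    /\bPublished\s+(?:in\s+)?((?:19|20)\d{2})\b/i,
    new RegExp(`\\b(?:${MONTHS})\\s+((?:19|20)\\d{2})\\b`, "i"),
  ];
  for (const re of explicit) {
    const m = text.match(re);
    if (m) return m[1];
  }
  // Fallback: skip years glued to letters, digits, hyphens, or version dots,
  // which are typical of document codes such as "2019-1234" or "2.2024".
  const candidates = text.match(/(?<![\w.-])(?:19|20)\d{2}(?![\w-]|\.\d)/g) || [];
  return candidates.find((y) => Number(y) <= maxYear) || null;
}

// Assumed boilerplate markers; a real list would be longer.
const BOILERPLATE = /available free of charge|all rights reserved|^https?:/i;

// Score one cover-page line; the weights are arbitrary illustration values.
function scoreTitleLine(line, index) {
  const text = line.trim();
  const words = text.split(/\s+/).filter(Boolean);
  const hasLower = /[a-z]/.test(text);
  const hasUpper = /[A-Z]/.test(text);
  let score = Math.max(0, 10 - index);                     // earlier lines rank higher
  if (words.length >= 3 && words.length <= 20) score += 5; // heading-sized line
  if (hasLower && hasUpper) score += 5;                    // mixed case
  if (hasUpper && !hasLower) score -= 5;                   // ALL-CAPS banner
  if (BOILERPLATE.test(text) || /[:.]$/.test(text)) score -= 10;
  return score;
}

// Return the highest-scoring non-empty line of the first page.
function pickTitle(pageText) {
  const lines = pageText.split(/\r?\n/).map((l) => l.trim()).filter(Boolean);
  let best = null;
  let bestScore = -Infinity;
  lines.forEach((line, i) => {
    const s = scoreTitleLine(line, i);
    if (s > bestScore) {
      best = line;
      bestScore = s;
    }
  });
  return best;
}
```

With these weights, an all-caps banner such as "JOINT TASK FORCE" on the first line scores lower than a mixed-case heading on the second line, and a line ending in a colon or matching a boilerplate phrase is penalized further.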
Smart metadata merging
Binary header data (when present) is treated as authoritative and takes priority. Cover page text fills any fields the binary stage missed. URL parsing is the final fallback. This ordering means the extension works well across the full range of PDFs: well-tagged academic papers, sparse government publications, and everything in between.
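A minimal sketch of that priority order, assuming each stage returns a plain object with optional title, authors, year, and doi fields. The field list and the names mergeMetadata and isFilled are hypothetical, not taken from the extension's source.

```javascript
// Hypothetical field list; the extension may track more or different fields.
const FIELDS = ["title", "authors", "year", "doi"];

// A field counts as filled if it is a non-empty array, a non-blank string,
// or any other non-null value.
function isFilled(value) {
  if (Array.isArray(value)) return value.length > 0;
  if (typeof value === "string") return value.trim() !== "";
  return value != null;
}

// Binary header wins, cover page text fills gaps, URL parsing is the last resort.
function mergeMetadata(binary = {}, coverText = {}, urlGuess = {}) {
  const merged = {};
  for (const field of FIELDS) {
    const source = [binary, coverText, urlGuess].find((s) => isFilled(s[field]));
    merged[field] = source ? source[field] : null;
  }
  return merged;
}
```

For example, a binary header with an empty title but a year keeps its year, while the title comes from the cover page stage and anything still missing falls through to the URL guess.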
DOI auto-detection for all PDFs
Both the binary and text stages now scan for DOI patterns. When found, the DOI field is pre-filled and the full reference link resolves to https://doi.org/... automatically rather than the raw PDF URL.
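As a rough illustration of the DOI scan and link resolution, the sketch below uses a commonly cited pattern for modern DOIs. The pattern, the trailing-punctuation cleanup, and the names findDoi and referenceLink are assumptions for this example; the extension's actual regex is not shown in these notes.

```javascript
// Common pattern for modern DOIs ("10.", a 4-9 digit registrant code, "/",
// then a suffix). It is not guaranteed to match every valid DOI.
const DOI_PATTERN = /\b10\.\d{4,9}\/[^\s"<>]+/;

// Return the first DOI found in the text, or null if there is none.
function findDoi(text) {
  const match = text.match(DOI_PATTERN);
  if (!match) return null;
  // Trim punctuation that usually belongs to the sentence, not the DOI.
  return match[0].replace(/[.,;:)\]]+$/, "");
}

// Prefer the resolver link when a DOI is known; otherwise keep the PDF URL.
function referenceLink(doi, pdfUrl) {
  return doi ? `https://doi.org/${doi}` : pdfUrl;
}
```

Because the same function can run over both header strings and rendered page text, a DOI printed only on the cover page still produces a https://doi.org/... reference link.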
Bug Fixes
Title truncation no longer breaks on colons
The previous version split the title on any : character, mangling titles like Security and Privacy Controls: for Information Systems and Organizations into just the first clause. The new logic only strips trailing | Site Name suffixes, leaving colons in legitimate titles intact.
Source code released under Mozilla Public License 2.0
Older versions
Version 2.0
Released May 16, 2026 - 618.67 KB
Works with Firefox 128.0 and later
Citation Engine & Bug Fixes
Date Hallucinations Cured: The scraper no longer mistakenly pulls random four-digit numbers (like statistics or future projections) from standard body text. It now strictly targets structured data blocks (JSON-LD), standard <meta> tags, and semantic <time> elements.
Intelligent Multiple Author Formatting: Fixed a bug that resulted in raw, poorly formatted strings with double commas. The engine now detects multiple authors, strips trailing punctuation, inverts names to proper APA format (Lastname, I.), and correctly joins them with an ampersand.
Corporate & Institutional Normalization: Added a global dictionary to catch and auto-capitalize standard organizational acronyms (e.g., ibm to IBM, pmi to PMI, cisa to CISA) when they are used as primary authors or site names.
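The author formatting and acronym normalization described above might look something like the following. This is a sketch under stated assumptions: the names toApaName, formatAuthors, and normalizeOrgName are hypothetical, the acronym set holds only the three examples given, and every entry passed to formatAuthors is treated as a personal name in "First Middle Last" order.

```javascript
// Convert "Jane Q. Smith," to "Smith, J. Q." (APA reference-list style).
function toApaName(fullName) {
  const cleaned = fullName.trim().replace(/[.,;]+$/, ""); // strip trailing punctuation
  const parts = cleaned.split(/\s+/).filter(Boolean);
  if (parts.length < 2) return cleaned;                    // single token: leave as-is
  const last = parts[parts.length - 1];
  const initials = parts
    .slice(0, -1)
    .map((p) => `${p[0].toUpperCase()}.`)
    .join(" ");
  return `${last}, ${initials}`;
}

// Join inverted names with commas and an ampersand before the final author.
function formatAuthors(names) {
  const formatted = names.map(toApaName).filter(Boolean);
  if (formatted.length <= 1) return formatted.join("");
  const head = formatted.slice(0, -1).join(", ");
  return `${head}, & ${formatted[formatted.length - 1]}`;
}

// Assumed acronym dictionary; only the examples from the release notes.
const ACRONYMS = new Set(["ibm", "pmi", "cisa"]);

// Upper-case known acronyms word by word, leaving other words untouched.
function normalizeOrgName(name) {
  return name
    .split(/\s+/)
    .map((w) => (ACRONYMS.has(w.toLowerCase()) ? w.toUpperCase() : w))
    .join(" ");
}
```

Organizational authors would need to bypass toApaName, since inverting a name like "Joint Task Force" makes no sense; the sketch leaves that decision to the caller.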
UI & Stability
Blank Screen / Race Condition Resolved: Fixed a critical timing bug where the popup would occasionally render entirely blank. Initialization is now securely bound to a DOMContentLoaded event listener.
Graceful Error Handling: If a highly restricted website completely blocks the scraping script, the extension will no longer fail silently; it now displays a clear UI error message and safely falls back to parsing the URL domain structure.
Source code released under Mozilla Public License 2.0