XML Job Posts & Signals Feeds

JobFront can provide several types of XML feeds containing clean, formatted job post data:

  • (Most common) Large XML feed for your overall, chronologically-ordered job feed

  • Different XML feeds per source JobFront tracks for you

  • Smaller XML feeds for different categories (job function, location, etc)

How it works

  • Per-source feeds (usually for job boards): once you have access to the JobFront data playground app, you can search for the sources JobFront tracks for you. The interface shows each source's XML feed link, which you can copy into your own XML ingestion program

  • A single large feed from many sources in chronological order (for large job boards or sales intelligence tools): we will usually give you an API token that you use to generate a signed S3 bucket URL. You then use that signed URL to download the XML feeds we provide from S3

    • You will be able to retrieve the full active jobs index, along with delta files (jobs added and removed in the past day)
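
As a rough sketch of how the daily delta files could be applied to a local copy of the index: this assumes (it is not confirmed above) that the added and removed files use the same <jobs>/<job> layout as the full feed and that every removed job carries at least its <id>. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

def jobs_by_id(xml_text):
    """Map each top-level <job>'s <id> to its parsed element."""
    root = ET.fromstring(xml_text)
    return {job.findtext("id"): job for job in root.findall("job")}

def apply_deltas(index, added_xml, removed_xml):
    """Return a new index with removed jobs dropped and added jobs merged in."""
    updated = dict(index)
    for job_id in jobs_by_id(removed_xml):
        updated.pop(job_id, None)  # ignore ids we never had
    updated.update(jobs_by_id(added_xml))
    return updated

# Tiny made-up feeds, just to exercise the logic.
full = "<jobs><job><id>J1</id></job><job><id>J2</id></job></jobs>"
added = "<jobs><job><id>J3</id></job></jobs>"
removed = "<jobs><job><id>J1</id></job></jobs>"

index = apply_deltas(jobs_by_id(full), added, removed)
print(sorted(index))
```

Removals are applied before additions so that a job appearing in both files on the same day ends up present; whether that ordering matches how the delta files are generated is an assumption worth confirming with JobFront.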

Job Posts - XML Format

<?xml version="1.0" encoding="UTF-8"?>
<jobs>
  <job>
    <id>Ja3f5322637f741200oeldu9c2ad2fbfe</id>
    <title>Sr. Director/Executive Director, Analytical Development</title>
    <description>Lead analytical development for late-stage cardiorenal small molecule programs</description>
    <post> FULL JOB POST HTML </post>
    <post_language>en</post_language>
    <url_job>https://job-boards.greenhouse.io/bridgebio/jobs/XXX</url_job>
    <work_model>on_site</work_model>
    <commitment>full_time</commitment>
    <level>expert</level>
    <benefits><benefit>401k participation</benefit></benefits>
    <requirements><requirement>Requires PhD</requirement></requirements>
    <responsibilities><responsibility>Work in the lab to develop various programs</responsibility></responsibilities>
    <salary>
      <min>60000</min>
      <max>80000</max>
      <currency>USD</currency>
      <period>year</period>
    </salary>
    <created>2025-04-03T12:37:45</created>
    <created_at>1743683865</created_at>
    <locations><location><text></text><city></city><state></state><country></country><zip></zip><latitude></latitude><longitude></longitude><cbsas><cbsa><title></title><code></code></cbsa></cbsas></location></locations>
    <tags><tag>Scientist</tag></tags>
    <source>
      <id>S1b6ad6b3d4ec4fa3ac7b5b98fa5c1486</id>
      <name>BridgeBio Pharma</name>
      <description>BridgeBio Pharma develops transformative medicines for genetic diseases and cancers with genetic drivers.</description>
      <domain>bridgebio.com</domain>
      <url_logo>https://jobfront-public.s3.us-west-2.amazonaws.com/sources/logos/S1b6ad6b3d4ec4fa3ac7b5b98fa5c1486.jpg</url_logo>
    </source>
  </job>
</jobs>

Root Elements

  • <?xml version="1.0" encoding="UTF-8"?>: XML declaration

  • <jobs>: Root element containing all job entries

Job Entry

Each job is enclosed in a <job> tag and contains the following fields:

Basic Job Information

  • <id> Unique identifier for the job

  • <title> Title of the job position

  • <commitment> Type of job commitment

    • Options: full_time, part_time, internship, contract, temporary, volunteer, other

  • <description> Brief 1-line description of the job

  • <post> Detailed job posting in HTML format

  • <post_language> Most likely language of the post, as a 2-character code

    • Note: The most common value is lowercase "en" (English), but we occasionally see jobs in dozens of other languages, each indicated by its 2-character code

    • Note: Some job posts contain several languages; in that case we provide the single predominant language

  • <level> Level of the job position (e.g., entry_level, senior)

    • Options: internship, entry_level, junior, mid_level, senior, expert ("" blank, if not detected)

  • <benefits> List of freetext benefits offered with the job

    • List of <benefit> tags with freetext benefits described in each

  • <requirements> List of requirements for the job

    • List of <requirement> tags with freetext requirements described in each

  • <responsibilities> Responsibilities associated with the job

    • List of <responsibility> tags with freetext responsibilities described in each
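
A minimal sketch of reading these fields with Python's standard xml.etree.ElementTree. The sample is a trimmed copy of the example above; parse_job and texts are hypothetical helper names, and mapping a blank <level> to None is a choice made here, not something the feed requires.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<jobs>
  <job>
    <id>Ja3f5322637f741200oeldu9c2ad2fbfe</id>
    <title>Sr. Director/Executive Director, Analytical Development</title>
    <commitment>full_time</commitment>
    <post_language>en</post_language>
    <level>expert</level>
    <benefits><benefit>401k participation</benefit></benefits>
    <requirements><requirement>Requires PhD</requirement></requirements>
    <responsibilities></responsibilities>
  </job>
</jobs>"""

def texts(job, path):
    """Collect the text of every element matching path (empty list if none)."""
    return [el.text or "" for el in job.findall(path)]

def parse_job(job):
    return {
        "id": job.findtext("id", default=""),
        "title": job.findtext("title", default=""),
        "commitment": job.findtext("commitment", default=""),
        "post_language": job.findtext("post_language", default=""),
        # <level> is blank when no level was detected
        "level": job.findtext("level") or None,
        "benefits": texts(job, "benefits/benefit"),
        "requirements": texts(job, "requirements/requirement"),
        "responsibilities": texts(job, "responsibilities/responsibility"),
    }

jobs = [parse_job(j) for j in ET.fromstring(SAMPLE).findall("job")]
print(jobs[0]["level"], jobs[0]["benefits"], jobs[0]["responsibilities"])
```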

URLs

  • <url_job> Origin URL for the job post

Dates

  • <created> Date the job was seen (usually when we scraped this job for the first time, although sometimes we can determine that the job was posted in the past and we use that more precise historical timestamp)

    • Format: YYYY-MM-DDThh:mm:ss, in UTC

      • <created>2025-04-03T12:37:45</created>

  • <created_at> Timestamp when the job was created

    • Format: Unix timestamp

      • <created_at>1743683865</created_at>
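
The two date fields can be parsed as below. This is a small sketch using only the standard library; because the <created> string carries no offset, UTC is attached explicitly, as the format note above describes. The function names are illustrative.

```python
from datetime import datetime, timezone

def parse_created(created):
    """Parse <created> (no offset in the string) as a UTC datetime."""
    naive = datetime.strptime(created, "%Y-%m-%dT%H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)

def parse_created_at(created_at):
    """Parse the <created_at> Unix timestamp as a UTC datetime."""
    return datetime.fromtimestamp(int(created_at), tz=timezone.utc)

created = parse_created("2025-04-03T12:37:45")
created_at = parse_created_at("1743683865")
print(created == created_at)  # the two sample values describe the same instant
```

If you only need one of the two, <created_at> is the less ambiguous choice since a Unix timestamp has no time zone to misread.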

Salary Information

<salary> Salary information enclosing the following subtags:

  • <min> Minimum salary

  • <max> Maximum salary

  • <currency> 3-character currency code

    • <salary><currency> - ['USD','AED','CZK','BTC', ...]

  • <period> Pay period (default is "year")

    • <salary><period> - ['year','month','week','biweek','day','hour','minute']
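
One way to read the <salary> block into a typed record, sketched under two assumptions that are not stated above: blank <min>/<max> tags mean the value is unknown, and a blank <period> should fall back to the documented default of "year". Salary and parse_salary are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

@dataclass
class Salary:
    min: Optional[float]
    max: Optional[float]
    currency: str
    period: str

def _number(text):
    """Convert tag text to float; blank or missing text becomes None."""
    return float(text) if text and text.strip() else None

def parse_salary(job):
    node = job.find("salary")
    if node is None:
        return None
    return Salary(
        min=_number(node.findtext("min")),
        max=_number(node.findtext("max")),
        currency=node.findtext("currency", default=""),
        # Assumption: treat a blank period as the documented default
        period=node.findtext("period") or "year",
    )

job = ET.fromstring(
    "<job><salary><min>60000</min><max>80000</max>"
    "<currency>USD</currency><period></period></salary></job>"
)
salary = parse_salary(job)
print(salary)
```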

Locations

<locations> List of structured location information

  • List of <location> tags containing the following subtags:

  • <text> City, State, Country of the job location (includes all available entries)

  • <city> City of the job location

  • <state> State or region of the job location

  • <country> Country of the job location

  • <zip> Zip code of the job location

  • <latitude> Lat of the job location

  • <longitude> Long of the job location

  • <cbsas> List of <cbsa> tags that contain both cbsa <code> and <title>

    • <locations><location><cbsas><cbsa><code> - CBSA code
      <locations><location><cbsas><cbsa><title> - CBSA title
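
A sketch of flattening <locations> into plain dictionaries. The sample values are made up for illustration (the real feed may format states, countries, or zip codes differently), and skipping <cbsa> entries with no <code> is a choice made here so that the empty template shown in the example above yields an empty list.

```python
import xml.etree.ElementTree as ET

# Illustrative values only; not taken from a real feed.
SAMPLE = """<job><locations><location>
  <text>Palo Alto, CA, US</text><city>Palo Alto</city><state>CA</state>
  <country>US</country><zip>94301</zip>
  <latitude>37.4419</latitude><longitude>-122.1430</longitude>
  <cbsas><cbsa><title>San Jose-Sunnyvale-Santa Clara, CA</title><code>41940</code></cbsa></cbsas>
</location></locations></job>"""

def _coord(loc, tag):
    """Return a coordinate as float, or None when the tag is blank or missing."""
    text = loc.findtext(tag)
    return float(text) if text and text.strip() else None

def parse_location(loc):
    return {
        "text": loc.findtext("text", default=""),
        "city": loc.findtext("city", default=""),
        "state": loc.findtext("state", default=""),
        "country": loc.findtext("country", default=""),
        "zip": loc.findtext("zip", default=""),  # keep as text: leading zeros matter
        "latitude": _coord(loc, "latitude"),
        "longitude": _coord(loc, "longitude"),
        "cbsas": [
            {"code": c.findtext("code"), "title": c.findtext("title", default="")}
            for c in loc.findall("cbsas/cbsa")
            if c.findtext("code")  # skip empty placeholder entries
        ],
    }

def parse_locations(job):
    return [parse_location(loc) for loc in job.findall("locations/location")]

locations = parse_locations(ET.fromstring(SAMPLE))
print(locations[0]["city"], locations[0]["cbsas"])
```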

Source Information

  • <id> Unique identifier for the source

  • <name> Name of the job source, organization, or company

  • <description> 1-line description of the job source

  • <domain> Naked domain of the source, for example “google.com”

  • <url_logo> Image/logo url hosted by JobFront

Formatting Notes

  1. All text content is escaped using XML entity encoding to prevent XML parsing errors.

  2. Numeric fields (like salary) are not enclosed in quotes.

  3. Empty fields are represented by self-closing tags or empty string values.

  4. Lists (like locations, tags, industries) are represented by multiple child elements within a parent element.
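
In practice, a standard XML parser takes care of these notes for you. The short sketch below uses a made-up <job> fragment to show how Python's xml.etree.ElementTree decodes entities, how empty and self-closing tags read back, and how a list parent expands into its children.

```python
import xml.etree.ElementTree as ET

# Made-up fragment exercising each formatting note.
doc = """<job>
  <title>R&amp;D Engineer &lt;Platform&gt;</title>
  <level></level>
  <description/>
  <tags><tag>Scientist</tag><tag>Lab</tag></tags>
</job>"""

job = ET.fromstring(doc)

title = job.findtext("title")              # entities are decoded by the parser
level = job.findtext("level")              # empty tag reads back as ""
description = job.findtext("description")  # self-closing tag also reads back as ""
salary = job.findtext("salary")            # absent tag reads back as None
tags = [t.text for t in job.findall("tags/tag")]

print(title)
print(repr(level), repr(description), repr(salary))
print(tags)
```

Because empty and self-closing tags are indistinguishable after parsing, downstream code only needs to separate "blank" from "absent".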


Intent Signals - XML Format

<?xml version="1.0" encoding="UTF-8"?>
<signals>
  <signal>
    <id>P000ca7e5-63b0-4e6e-8577-5180163f069e</id>
    <category>finance</category>
    <type>cash-flow-constraints</type>
    <recency>active</recency>
    <severity>medium</severity>
    <match>medium</match>
    <classification>inferred</classification>
    <text>The company appears to be facing challenges with financial liquidity and cash flow management.</text>
    <evidences></evidences>
    <jobs>
      <job><id>J9f918b793cce4e079dd06e39400ad203</id><title>Service Technician/ Diesel Mechanic</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=7df7ed7f-fc58-4673-8235-e1c853184d6d</url></job>
      <job><id>J84e747df6bba4573a223df4a5d869409</id><title>Accounting Clerk</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=53b33c5e-105f-4ce6-846a-1d32a8be1d19</url></job>
      <job><id>Jcc9d6dc40ab24e0f9b17c28776992bc6</id><title>Staff Accountant</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=8ee5e341-b207-420e-aff3-b6e4323d8c70</url></job>
    </jobs>
    <created>2025-04-12T22:41:14</created>
    <created_at>1744512074</created_at>
    <start_at>1738790452</start_at>
    <recent_at>1740766553</recent_at>
    <source>
      <id>S9c4a6ed8d2a34fc2832d2eb42b27b2f7</id>
      <name>Allstate Peterbilt Group</name>
      <description>A dealership group offering sales, parts, and service for Peterbilt trucks and other commercial vehicles across multiple locations.</description>
      <domain>allstatepeterbilt.com</domain>
      <url_logo>https://jobfront-public.s3.us-west-2.amazonaws.com/sources/logos/S9c4a6ed8d2a34fc2832d2eb42b27b2f7.jpg</url_logo>
    </source>
  </signal>
</signals>

Root Elements

  • <?xml version="1.0" encoding="UTF-8"?>: XML declaration

  • <signals>: Root element containing all signal entries

Signal Entry

Each signal is enclosed in a <signal> tag and contains the following fields:

Basic Signal Information

  • <id> Unique identifier for the signal (from problem_id)

  • <category> Category of the signal (from problem_category)

  • <type> Type of the signal (from problem_type)

  • <recency> Recency status of the signal (from problem_recency)

  • <severity> Severity level of the signal (from problem_severity)

  • <match> Match information for the signal (from problem_match)

  • <classification> Classification of the signal (from problem_classification)

  • <text> Descriptive text of the signal (from problem_text)

Evidence Information

Enclosed in <evidences> tags:

  • Multiple <evidence> tags: Each containing evidence supporting the signal

Associated Jobs

Enclosed in <jobs> tags:

  • Multiple <job> tags, each containing:

  • <id>: Unique identifier for the job (from job_id)

  • <title>: Title of the job (from job_title)

  • <url>: URL for the job posting (from url_job)

Dates

  • <created>: Formatted timestamp of when the signal was created (YYYY-MM-DDThh:mm:ss)

  • <created_at>: Unix timestamp of when the signal was created

  • <start_at>: Unix timestamp of when the signal started (from problem_start_at)

  • <recent_at>: Unix timestamp of the most recent signal occurrence (from problem_recent_at)

Source Information

Enclosed in <source> tags:

  • <id>: Unique identifier for the source (from source_id)

  • <name>: Name of the source (from source_name)

  • <description>: Description of the source (from source_description)

  • <domain>: Domain URL of the source (from source_url)

  • <url_logo>: URL for the source's logo image (from source_url_logo)
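
A sketch of reading one signal with Python's standard library, using a trimmed copy of the example above. Note that <id> appears at three levels (the signal, each associated job, and the source), so the lookups use explicit paths rather than a recursive search. parse_signal is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<signals>
  <signal>
    <id>P000ca7e5-63b0-4e6e-8577-5180163f069e</id>
    <category>finance</category>
    <type>cash-flow-constraints</type>
    <severity>medium</severity>
    <evidences></evidences>
    <jobs>
      <job><id>J84e747df6bba4573a223df4a5d869409</id><title>Accounting Clerk</title><url>https://recruiting.ultipro.com/wdl1000/JobBoard/05fea415-1d4a-a06d-1bca-d0abf475250e/OpportunityDetail?opportunityId=53b33c5e-105f-4ce6-846a-1d32a8be1d19</url></job>
    </jobs>
    <start_at>1738790452</start_at>
    <recent_at>1740766553</recent_at>
    <source>
      <id>S9c4a6ed8d2a34fc2832d2eb42b27b2f7</id>
      <domain>allstatepeterbilt.com</domain>
    </source>
  </signal>
</signals>"""

def parse_signal(sig):
    return {
        "id": sig.findtext("id"),  # direct child only, not the job or source id
        "category": sig.findtext("category", default=""),
        "type": sig.findtext("type", default=""),
        "severity": sig.findtext("severity", default=""),
        "evidences": [e.text or "" for e in sig.findall("evidences/evidence")],
        "jobs": [
            {"id": j.findtext("id"), "title": j.findtext("title"), "url": j.findtext("url")}
            for j in sig.findall("jobs/job")
        ],
        "start_at": int(sig.findtext("start_at")),
        "recent_at": int(sig.findtext("recent_at")),
        "source_id": sig.findtext("source/id"),
        "source_domain": sig.findtext("source/domain"),
    }

signals = [parse_signal(s) for s in ET.fromstring(SAMPLE).findall("signal")]
print(signals[0]["type"], [j["title"] for j in signals[0]["jobs"]])
```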


Retrieving XML Files

XML feeds are stored in an S3 bucket. To retrieve the S3 bucket URL:

  • Aggregate feed URLs will be shared with you directly, as they are typically unique per customer

  • Per-source feed URLs can be retrieved via the app (when logged in)

Authentication

Most XML feeds require authentication. We enforce authentication via AWS signed URLs for access to the S3 buckets

  • JobFront will provide you with an API token

  • Submit your API token to our API to retrieve a signed download URL for your S3 bucket

Authorization: Bearer <api_token>
curl https://api.jobfront.io<endpoint> -H "Authorization: Bearer <api_token>"

where <endpoint> is:

/v2/xml/jobs - jobs
/v2/xml/signals - signals

More explicitly, for both the Jobs and Signals XML feeds, first authenticate with the API calls below to retrieve the available signed URLs:

curl https://api.jobfront.io/v2/xml/jobs -H "Authorization: Bearer <api_token>"
curl https://api.jobfront.io/v2/xml/signals -H "Authorization: Bearer <api_token>"

The API returns signed URLs for the different XML files. Note that each field returns a list of URLs: for some accounts these files are too large (tens of GB) to manage as a single file, so we split the dataset into multiple files.

{
    "signed_urls_all":["https://aws.amazon.com/..."],
    "signed_urls_added":["https://aws.amazon.com/..."],
    "signed_urls_removed":["https://aws.amazon.com/..."]
}
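
The flow above (request signed URLs, then download each file) could be sketched in Python as follows. Since files can reach tens of GB, the sketch parses each download as a stream rather than loading it into memory. It assumes, without confirmation from the text above, that each signed URL serves uncompressed XML and that every split file is a complete, well-formed document on its own. All function names are hypothetical; only the endpoint paths, the Authorization header, and the response field names come from this page.

```python
import io
import json
import urllib.request
import xml.etree.ElementTree as ET

API_BASE = "https://api.jobfront.io"

def signed_url_request(feed, api_token):
    """Build the authenticated GET request for /v2/xml/jobs or /v2/xml/signals."""
    if feed not in ("jobs", "signals"):
        raise ValueError(f"unknown feed: {feed!r}")
    return urllib.request.Request(
        f"{API_BASE}/v2/xml/{feed}",
        headers={"Authorization": f"Bearer {api_token}"},
    )

def fetch_signed_urls(feed, api_token):
    """Call the API and return the JSON body holding the signed URL lists."""
    with urllib.request.urlopen(signed_url_request(feed, api_token)) as resp:
        return json.load(resp)

def iter_entries(stream, tag):
    """Yield each <tag> element directly under the root, one at a time."""
    root, depth = None, 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            root = elem if root is None else root
            depth += 1
        else:
            depth -= 1
            # depth == 1 means the element that just closed was a child of the root,
            # so <job> tags nested inside a <signal> are not yielded on their own
            if depth == 1 and elem.tag == tag:
                yield elem
                root.clear()  # drop processed entries so memory stays flat

def stream_feed(feed, api_token, key="signed_urls_all"):
    """Stream every entry from every file listed under one response field."""
    tag = "job" if feed == "jobs" else "signal"
    for url in fetch_signed_urls(feed, api_token)[key]:
        with urllib.request.urlopen(url) as resp:
            yield from iter_entries(resp, tag)

# Offline check of the streaming parser (no network access needed).
demo = io.BytesIO(
    b"<signals><signal><id>P1</id><jobs><job><id>J1</id></job></jobs></signal>"
    b"<signal><id>P2</id><jobs></jobs></signal></signals>"
)
signal_ids = [s.findtext("id") for s in iter_entries(demo, "signal")]
print(signal_ids)
```

With a real token, `for job in stream_feed("jobs", token): ...` would walk the full index, and passing key="signed_urls_added" or key="signed_urls_removed" would walk the delta files instead.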

Please reach out to us at hello@jobfront.io and we'll get you set up.


Last updated 1 month ago