Recent Releases of flowsa
flowsa - v2.1.0
Major changes:
Implemented ability to calculate and track data quality (DQ) scores for FBS
- DQ scores for Data Collection, Data Reliability, Geographical Correlation, Technological Correlation, Temporal Correlation
- DQ scores based on EPA's Guidance on Data Quality Assessment for Life Cycle Inventory Data
- DQ scores are beta version in this release and not included in all FBS methods
- All FBS methods include scores for Geographical Correlation, Technological Correlation, and Temporal Correlation
- Full DQ implementation will require updating FBA methods with Data Collection and Data Reliability scores
- New functions added to calculate DQ
- New `adjust_dqi_reliability_collection_scores()` modifies Data Reliability and Data Collection scores based on source and target sector levels
- New `assign_temporal_correlation()` assigns temporal DQ scores based on the difference between the year of the data and the target year of the FBS
- New `assign_geographical_correlation()` assigns DQ scores for geoscale based on the data geoscale vs the target FBS geoscale
- New `assign_technological_correlation()` assigns DQ scores based on the difference between source and target sectors
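As a rough illustration of scoring temporal correlation from the gap between the data year and the FBS target year, the sketch below maps the gap to a 1-5 score. The function name `temporal_score` and the threshold values are hypothetical placeholders for illustration; they are not taken from flowsa or quoted from the EPA guidance.

```python
def temporal_score(data_year, target_year, thresholds=(3, 6, 10, 15)):
    """Return a 1-5 temporal correlation score (1 = closest in time).

    `thresholds` are illustrative upper bounds, in years, for scores 1-4.
    They are assumptions for this sketch, not values read from flowsa.
    """
    gap = abs(target_year - data_year)
    for score, limit in enumerate(thresholds, start=1):
        if gap < limit:
            return score
    # Gap is at or beyond the last threshold: worst score.
    return len(thresholds) + 1


print(temporal_score(2017, 2019))  # 2-year gap
print(temporal_score(2002, 2019))  # 17-year gap
```

Passing the thresholds as a parameter keeps the scoring rule separate from the lookup, so a different set of bounds can be swapped in without touching the logic.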
Modified how activities are mapped to sectors to enable proper accounting for Technological Correlation scores, which are based on the difference between the original activity to sector mapping and the target sector level.
- First map all activities to the sector year identified in data crosswalk, then later convert to target sector year. Previously we immediately converted the crosswalk to target sector year, before matching on activities
- We modified how NAICS are converted between NAICS years
- We originally mapped all activities to NAICS6+ in the activity to sector crosswalk, then converted between NAICS years, then aggregated to target sector level, then merged NAICS to the activity-based data sets. This method is problematic when assigning DQ scores and unnecessary for FBS methods that are generated for more aggregated sector levels
- Now to convert, we map the activities to the original sector year associated with that data. We then identify how many child NAICS there are (at NAICS 6) for each of the sectors, determine how many of those child sectors are converted to new sectors for the target sector year in `generate_naics_crosswalk_conversion_ratios()`, and proportionally attribute the sectors to the new sectors for the target sector year.
- For example, if we are converting NAICS4 across years, we identify all child NAICS6 for each NAICS4 and determine how those NAICS6 map between years. If there are five child NAICS6 and one child NAICS6 maps to a different parent NAICS4 in the target year, then 1/5 of the original NAICS4 parent value is mapped to a different NAICS4 in the target year
- Conversion is not based on numeric values within the FBS because we might only have NAICS4 values, not NAICS6 and therefore do not have the data to create proportional conversions
- New `subset_sector_key()` - Subsets the sector key to return the sector/industry that most closely maps activity/source sectors to target sectors. Drops parent sectors within the crosswalk, assigns Technological Correlation scores, and modifies DataReliability and DataCollection scores based on the mapping
- Modified NAICS conversion data check - originally checked if a sector-like activity was found in any NAICS year outside of the target year and if so, mapped to target year. This function did not always map correctly because the sector could be found in multiple NAICS years, and the NAICS years map differently to target year sectors
- Revised function to check for the closest NAICS year to the target year and use that year to map to target NAICS
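The NAICS4 example above (five child NAICS6, one of which moves to a different parent) can be sketched as a count-based ratio calculation. This is a minimal sketch, not the body of `generate_naics_crosswalk_conversion_ratios()`: the helper name `conversion_ratios` and all sector codes are hypothetical, and it assumes each source NAICS6 maps to exactly one target-year NAICS6.

```python
from collections import Counter, defaultdict


def conversion_ratios(child_map, level=4):
    """Count-based conversion shares between sector years.

    child_map: {source-year NAICS6: target-year NAICS6} (one-to-one assumed).
    Returns {source parent: {target parent: share}} at `level` digits.
    """
    counts = defaultdict(Counter)
    for source_child, target_child in child_map.items():
        counts[source_child[:level]][target_child[:level]] += 1
    return {
        parent: {target: n / sum(c.values()) for target, n in c.items()}
        for parent, c in counts.items()
    }


# Hypothetical codes: five children of "9999", one moves under "8888".
child_map = {
    "999911": "999911",
    "999912": "999912",
    "999913": "999913",
    "999914": "999914",
    "999915": "888815",
}
ratios = conversion_ratios(child_map)

# Apply the shares to a parent-level value; no NAICS6 values are needed.
parent_value = 100.0
converted = {t: parent_value * share for t, share in ratios["9999"].items()}
print(converted)
```

Because the shares come from counts of child sectors rather than flow amounts, the conversion works even when the dataset only holds NAICS4 values.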
Updated default NAICS year in Employment FBS to NAICS 2017 (revised from NAICS 2012)
- Changing NAICS year impacts the results of all FBS that use employment FBS as an allocation source
- Previously, some BLS QCEW data were imported as NAICS 2012 and left as NAICS 2012, while other data years were imported as NAICS 2017 and converted to NAICS 2012
- Now, some BLS QCEW data are imported as NAICS 2012 and converted to NAICS 2017, while other data years are imported as NAICS 2017 and left as NAICS 2017
- Many of these Employment datasets published as NAICS 2017 are later converted back to NAICS 2012 for use as allocation sources in other FBS methods. A conversion from NAICS 2012 -> NAICS 2017 -> NAICS 2012 occurs, which changes the employment results based on our conversion functions, resulting in changes to those relevant flows in the FBS methods.
Modified how data are merged on location so we can correctly merge state with county data
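One way to picture a state-to-county merge is to derive a state-level key from each county code before joining. The sketch below assumes 5-digit county FIPS codes and state codes written as the 2-digit state prefix followed by "000"; the helper name, column names, and values are hypothetical, and this is not flowsa's merge code.

```python
def state_fips_from_county(county_fips):
    """Map a 5-digit county FIPS to a state-level code of the form 'SS000'."""
    return county_fips[:2] + "000"


# Hypothetical state-level values keyed by state code.
state_values = {"06000": 10.0, "36000": 4.0}

# Hypothetical county-level rows.
county_rows = [
    {"Location": "06001", "FlowAmount": 1.5},
    {"Location": "06003", "FlowAmount": 0.5},
    {"Location": "36001", "FlowAmount": 2.0},
]

# Attach the matching state value to every county row.
merged = [
    {**row, "StateValue": state_values.get(state_fips_from_county(row["Location"]))}
    for row in county_rows
]
for row in merged:
    print(row)
```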
Minor changes:
- Correct error in `attribute_flows_to_sectors()` - Original grouptotal assignment was based on original df FlowAmount values, but we reset the index, so needed to base grouptotal on new index of the df
- Adds FIPS scale (1,3,5) to FIPS_Crosswalk
- Add NAICS 2002, 2007, 2022 crosswalks
- Expand NAICSCrosswalkTimeSeries to include NAICS 2022
- New NAICSYearConcordance which maps published 6-digit sectors across years
- New Sector_Levels .csv which labels sector level and sector length for all sectors
- Update BLSQCEW NAICS years for 2011, 2022, and 2023 in sourcecatalog.yaml
- BLS QCEW `estimate_suppressed_qcew()` - Update the function to only estimate suppressed data up to the max sector level. Suppressed 6-digit sectors are no longer estimated when the target is 3-digit
- Consistent fips scale assignments. National = 5, state = 2, county = 1
- url updates to government FBA links
FBA Changes
- Generates new FBAs for EPA GHGI for 2019-2023
- Updated BEA FBAs (Supply, Use, GrossOutput) for 2012-2023
FBS Changes
- Updates to GHG FBS national (m1 and m2) for 2019 - 2023; drops 2012 - 2018 FBS which no longer will work with the latest FBAs
- New FBS method: Wages_national for 2017
- Updates Use and Supply tables in SUT format (see #453)
Includes PR:
441
452
453
455
456
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.6...v2.1.0
- Python
Published by catherinebirney 9 months ago
flowsa - v2.0.5
What's Changed
- Updates StateGHGI FBS for 2024 release, includes updated StateIO FBAs, in https://github.com/USEPA/flowsa/pull/442
- Add 2020 census data set for urban/rural splits in https://github.com/USEPA/flowsa/pull/444
- Expands educational attainment and adds school enrollment to `Census_ACS`
New FBAs
- stateiousesummary (2012 - 2023)
- EPA_StateGHGI (2012 - 2022)
New FBSs
- GHGstatem1 (2012 - 2022)
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.4...v2.0.5
Published by bl-young about 1 year ago
flowsa - v2.0.4
What's Changed
- Census Service Annual Survey in https://github.com/USEPA/flowsa/pull/421
- BEA Personal Consumption Expenditures by state in https://github.com/USEPA/flowsa/pull/420
- Census FBA datasets in https://github.com/USEPA/flowsa/pull/427
- Revised state level GHG data for CBEI in https://github.com/USEPA/flowsa/pull/428
- employment updates in https://github.com/USEPA/flowsa/pull/437
New FBAs
- Personal consumption expenditures by state (`BEA_PCE`)
- BLS Consumer Expenditures Survey (`BLS_CES`)
- Census American Community Survey (`Census_ACS`)
- Census County Business Patterns (`Census_CBP`), revised
- Census Economic Census, Class of Customer Statistics (`Census_EC`)
- Census Service Annual Survey (`Census_SAS`)
- State Inventory Tool (`EPA_SIT`); requires state data
- GHG Inventory data for select states to support EPA's Consumption Based Emissions Inventories
- Updated USDA ERS Farm Income and Wealth Statistics (`USDA_ERS_FIWS`)
- Updated NOAA Fisheries Landings (`NOAA_FisheriesLandings`)
New FBSs
- Stateemploymentm1 (added 2021-2023, updated all other years)
- Nationalemploymentm1 (added 2023, updated all other years)
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.3...v2.0.4
Published by bl-young over 1 year ago
flowsa - v2.0.3
Flow By Activity
- Updates EPA_GHGI through 2022 (2012-2022) in https://github.com/USEPA/flowsa/pull/406
- New BEA data for 2012-2022 (Summary & Gross Output), 2012 & 2017 Detail (using 2017 BEA schema)
- New CoA data (2022) and updated USGSMYBLead (2020) in https://github.com/USEPA/flowsa/pull/405
Flow By Sector
- BEA_Detail FBS for 2013-2016, 2018-2022
- Updated GHG national FBS (m1 and m2) 2012-2022
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.2...v2.0.3
Published by bl-young almost 2 years ago
flowsa - v2.0.2
What's Changed
- add source publication dates to FBAs in https://github.com/USEPA/flowsa/pull/275
- option to specify git version/hash when returning an FBA via `git_version` in https://github.com/USEPA/flowsa/pull/399
- FBS metadata captures sequential FBAs in https://github.com/USEPA/flowsa/pull/399 (resolves #397)
- option to generate FBS that contains activity cols (`retain_activity_columns=True`) and sector name cols (`append_sector_names=True`) in https://github.com/USEPA/flowsa/pull/398
- Update 2017-2022 employment FBS in https://github.com/USEPA/flowsa/pull/410
- Updates Energy based datasets in https://github.com/USEPA/flowsa/pull/411
- add national CRHW methods in https://github.com/USEPA/flowsa/pull/414
- Global Materials Database in https://github.com/USEPA/flowsa/pull/415
- Enables calling multiple years at once for generating FBAs using `call_all_years: True` (#407)
- Allows skipping of `standardize_units` (#408)
- Limit numpy < 2.0.0 (see #418)
Flow-by-Activity
- Substantial updates to EIA Monthly Energy Review (`EIA_MER`)
- Adds UNEP Global Materials Flow Database (`UNEP_IRP_GMFD`)
Flow-by-Sector
- Updates `Employment_national` to 2017 NAICS schema, and adds 2021 and 2022 (#410)
- Updates `CRHW_national` to 2017 NAICS schema, and adds 2021 (#414)
- Adds `Energy_fossil_national`
- Adds `Raw_Material_Extraction_national`
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.1...v2.0.2
Published by bl-young almost 2 years ago
flowsa - v2.0.1
- new USEEIOv2 detail target schema
- fix broken FBAs (changed urls, changed excel tab names) for EIAAEO, NOAAFisheriesLandings, EIASEDS, EPACDDPath, USGSMYB, EPAWARMer
- reassign USDAIWMS '111333' NAICS code to Berry Totals (from Orchards) to align with USDACoA_Cropland assignment c7c4c4f
- assign USGSWUCoef "Beef and other cattle" to "11213" in addition to "11212" a9c01c4
- update method_status.md to reflect current status of FBA/FBS errors when generating
- updates to `stackedBarChart()` to work in situations where df is already a collapsed FBS and where there are unique input parameters
- updates to `FBSscatterplot()` - add boxplot option
- update `sector_aggregation()` to work for collapsed FBS df
- generalize `return_primary_activity_column()` to `return_primary_flow_column()` so function works for both FBA and FBS 46ddf48
- new `proxy_sector_data()` to enable substituting an FBS sector value for a missing sector
- correct zenodo authorship
- update links to new data commons server
- rename "fosslandings.csv" to "NOAAFisheriesLandings.csv" in external data folder
- update waste sector names, add 2 additional waste sectors 854eab3
- add warning when an FBS method uses "direct" when it should use "equal"
- update Land and Water FBS to use "equal" over "direct"
- edit log statements to make more concise
Changes to FBS
- Waternational2015m1, Waternational2010m1, Waterstate2015m1 have new results due to the reassigned activity to sector mapping for USDAIWMS and USGSWUCoef
Full Changelog: https://github.com/USEPA/flowsa/compare/v2.0.0...v2.0.1
Published by catherinebirney over 2 years ago
flowsa - v2.0.0
Major updates:
- Turn FlowByActivity and FlowBySector into classes
- Create FlowBy class for functions used in both FBA and FBS classes
- FBS yamls revised to work for unlimited recursive and sequential attribution methods
- Update how suppressed data and parent sectors are equally attributed to child sectors, by equally attributing parent values to the next level of child values, rather than equally attributing parent values to target-level child sectors
- Ability to attribute dfs on non-sector columns
- Option to fill in primary source data columns with attribution columns
- Add support for 2017 NAICS codes
- Add mappings for 2017 BEA codes to 2017 NAICS
- All state FBS model results sum to national FBS model results
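The change to equal attribution noted in the list above (parent values go to the next level of children, level by level, rather than straight to target-level sectors) can give different results when the hierarchy is uneven. A minimal sketch with a hypothetical hierarchy and a hypothetical helper `attribute_equally`, not flowsa's implementation:

```python
def attribute_equally(value, sector, children):
    """Split `value` equally among the immediate children of `sector`,
    recursing until sectors with no listed children are reached."""
    kids = children.get(sector, [])
    if not kids:
        return {sector: value}
    result = {}
    for kid in kids:
        result.update(attribute_equally(value / len(kids), kid, children))
    return result


# Hypothetical, uneven hierarchy: "111" has three children, "112" has one.
children = {
    "11": ["111", "112"],
    "111": ["1111", "1112", "1113"],
    "112": ["1121"],
}

# Level-by-level attribution: "112" keeps half of the parent value.
stepwise = attribute_equally(100.0, "11", children)

# Attributing directly to target-level sectors gives every leaf the same share.
direct = {leaf: 100.0 / len(stepwise) for leaf in stepwise}

print(stepwise)
print(direct)
```

In this example the level-by-level approach assigns half of the parent value to "1121", while the direct approach assigns it one quarter.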
Minor Updates:
- Updates to Paths, which require most up-to-date versions of fedefelmflowlist, esupy, stewi
- Sourcecatalog.yaml updated - 'activityschema' can be year-dependent
- New function comparenationalstate_fbs() which compares aggregated results of state vs national FBS
- New github action to generate a single FBA
- Change BEA FBA names to align with useeior naming schema
- Updates to BLM FBAs - adds previously dropped state and national data for select activities
- Rename "CAPHAPnational2017" to "CAPHAPnational2017_m1"
- Add D.C. data to ERS MLU and NWIS WU
- Drop support for Python 3.8, add support for Python 3.11
- Rename sectoraggregationlevels from "aggregated" and "disaggregated" to "flat", "parent-completeChild", and "parent-incompleteChild"
New Flow-By-Sector Models
- CAPHAPNonpoint 2014, 2017, 2020
- CAPHAPNonroad 2014, 2017, 2020
- CAPHAPOnroad 2014, 2017, 2020
- CAPHAPnational m1 and m2 2014, 2017, 2020
- CAPHAPstate m1 2014, 2017, 2020
- CRHW_state 2013, 2015, 2019
- Detail Make, Supply, and Use tables
- Employment national 2002, 2016, 2019, 2020
- GHG national m1 and m2, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
- GHG state m1 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
- GRDREL national 2020
- GRDREL state 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
- Landstate2012
- TRIDMRstate 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
New Flow-By-Activity Models
- Annual BEA summary make and use tables
- Bureau of Transportation Statistics Airline fuel cost and consumption (BTS_Airlines) 2000-2021
- EPA State Inventory Tool (EPA_SIT) 2018-2019
- EPA State GHG Inventories (EPA_StateGHGI) 2020
- State GHGI for Maine (1990-2019), New York (2017-2019), Vermont (1990-2019)
Justifications for changes in FBS model results
- All FBS models have revised results
- Updates to how suppressed data are estimated and to how parent sector values are equally attributed to child sectors impacted all FBS models
- CNHWnational2018, CNHWstate2014 - changes due to employment FBS; fix error in missing F01000
- Foodwastenational2018m2 - changes due to new CNHW 2018 as primary data source
- Waternational2015_m1 - Use state employment data for attribution instead of national employment data and reassigned an "Orchard" code to "Berry Totals"
Features removed:
- Generating Sankey diagrams (will be re-added in future version)
- Producing .bib files (will be re-added in future version)
- Appending material codes to sector codes (will be re-added in future version)
- Drop support for Waternationalm2 FBS
- Drop support for Electricitygenemissions FBS
What's Changed
- Seea recursive by @matthewlchambers in https://github.com/USEPA/flowsa/pull/243
- Adds display_tables() method to FBS by @matthewlchambers in https://github.com/USEPA/flowsa/pull/246
- allocate regional mecs data to states by @bl-young in https://github.com/USEPA/flowsa/pull/225
- update state ghg with develop by @catherinebirney in https://github.com/USEPA/flowsa/pull/257
- wrapped code in a function call so it won't all execute on import by @matthewlchambers in https://github.com/USEPA/flowsa/pull/289
- Override .astype() from DataFrame to fix issues introduced by pandas 1.5.0 by @matthewlchambers in https://github.com/USEPA/flowsa/pull/288
- Removes unused keys from certain FBS methods by @matthewlchambers in https://github.com/USEPA/flowsa/pull/291
- allow cached datasets to undergo further selection by @bl-young in https://github.com/USEPA/flowsa/pull/294
- Add argument to FB constructor to prevent (when necessary) adding columns by @matthewlchambers in https://github.com/USEPA/flowsa/pull/295
- Pull master into state_ghg branch by @ericmbell1 in https://github.com/USEPA/flowsa/pull/299
- Several small fixes by @matthewlchambers in https://github.com/USEPA/flowsa/pull/312
- Document flowby functions by @catherinebirney in https://github.com/USEPA/flowsa/pull/331
- Update GHGI method to v2.0 by @bl-young in https://github.com/USEPA/flowsa/pull/332
- Update saving of file and metadata by @bl-young in https://github.com/USEPA/flowsa/pull/337
- update recursive-refac with sector_mapping (includes up-to-date develop branch) by @catherinebirney in https://github.com/USEPA/flowsa/pull/330
- drop attributionsources col in substitutenonexistentvalues() by @catherinebirney in https://github.com/USEPA/flowsa/pull/338
- Updates to support state methods on recursive branch by @bl-young in https://github.com/USEPA/flowsa/pull/339
- Seea time series by @matthewlchambers in https://github.com/USEPA/flowsa/pull/341
- expand exclusion_fields by @catherinebirney in https://github.com/USEPA/flowsa/pull/343
- update industryspeckey() so it is not based on sector string length by @catherinebirney in https://github.com/USEPA/flowsa/pull/342
- Drop deprecated code and revise log handling by @bl-young in https://github.com/USEPA/flowsa/pull/347
- update NEI nonpoint data with 2020 by @bl-young in https://github.com/USEPA/flowsa/pull/348
- minor updates to support CAP_HAP FBS by @bl-young in https://github.com/USEPA/flowsa/pull/349
- specify NAICS year in activity_schema by @catherinebirney in https://github.com/USEPA/flowsa/pull/345
- Refactor with pathlib by @bl-young in https://github.com/USEPA/flowsa/pull/350
- Update cnhw by @catherinebirney in https://github.com/USEPA/flowsa/pull/352
- update recursive-refac with proportional_attribution by @catherinebirney in https://github.com/USEPA/flowsa/pull/354
- incorporate parent-incompleteChild sector hierarchy into maptosectors() by @catherinebirney in https://github.com/USEPA/flowsa/pull/355
- Gha error2 by @catherinebirney in https://github.com/USEPA/flowsa/pull/356
- Add action to create an FBA by @bl-young in https://github.com/USEPA/flowsa/pull/357
- update methods to estimate suppressed data by @catherinebirney in https://github.com/USEPA/flowsa/pull/360
- Set FBA columns and column dtypes by @catherinebirney in https://github.com/USEPA/flowsa/pull/362
- Split flowby.py into 3 python scripts by @catherinebirney in https://github.com/USEPA/flowsa/pull/363
- Updates to GHG state method by @bl-young in https://github.com/USEPA/flowsa/pull/365
- New GHG FBS method files by @ysrivas08 in https://github.com/USEPA/flowsa/pull/366
- update develop with recursive-refac (flowsa v2.0) by @catherinebirney in https://github.com/USEPA/flowsa/pull/364
- Bea refac by @catherinebirney in https://github.com/USEPA/flowsa/pull/372
- State ghgi v2 by @catherinebirney in https://github.com/USEPA/flowsa/pull/374
- update develop with changes to harmonize state and national fbs values by @catherinebirney in https://github.com/USEPA/flowsa/pull/375
- update flowsa2.0-release branch with develop by @catherinebirney in https://github.com/USEPA/flowsa/pull/376
- Flowsa2.0 release by @jbousquin in https://github.com/USEPA/flowsa/pull/381
- Update flowsa2.0 with updates from reviewer comments by @catherinebirney in https://github.com/USEPA/flowsa/pull/382
- 2017 Supply and Use tables from useeior by @WesIngwersen in https://github.com/USEPA/flowsa/pull/383
- update develop with Flowsa2.0 release branch by @catherinebirney in https://github.com/USEPA/flowsa/pull/386
- update state ghgi with develop/flowsa2.0-release branch by @catherinebirney in https://github.com/USEPA/flowsa/pull/387
- skip `select_by_fields` when using activity_sets by @bl-young in https://github.com/USEPA/flowsa/pull/392
- Revised approach to GHG models by @bl-young in https://github.com/USEPA/flowsa/pull/378
- Updates to generate 2014 CAPHAPNonroad without memory errors by @bl-young in https://github.com/USEPA/flowsa/pull/393
- update 2.0-release branch with changes from develop by @catherinebirney in https://github.com/USEPA/flowsa/pull/394
- Flowsa2.0 release by @catherinebirney in https://github.com/USEPA/flowsa/pull/367
New Contributors
- @ysrivas08 made their first contribution in https://github.com/USEPA/flowsa/pull/366
- @jbousquin made their first contribution in https://github.com/USEPA/flowsa/pull/381
v2.0.0 reviewers
Thanks to David Graham and Justin Bousquin for reviewing FLOWSA for the v2.0.0 release.
Full Changelog: https://github.com/USEPA/flowsa/compare/v1.3.2...v2.0.0
Published by catherinebirney over 2 years ago
flowsa - v1.3.1
FLOWSA v1.3.1 release coincides with supply-chain-factors v1.2 release
Greenhouse Gas (GHG) Flow-By-Sector (FBS) Method changes:
- Updates GHG FBS m1 to equally allocate BEA and EIA MECS to sectors rather than use employment for attribution
- Updates GHG FBS m1 for 2016, 2017, 2018, and 2019 with latest inventory
- Use a common GHG FBS m1 yaml file as basis for all years
- Update some GHG FBS attribution source data years (MECS)
- Updates to GHGI activity names and activity to sector mapping
New Flow-By-Activity and Flow-By-Sector datasets:
- Adds 2020 GHGI FBA and GHG national FBS
Additional, minor changes:
- Update stackedbarchart() to use colors defined in visualizationessentials.csv, to include option to specify target sector level and to generalize attribution methods (direct vs attributed)
Published by catherinebirney about 3 years ago
flowsa - v1.3.0
Major Updates:
- Option to append material codes to end of sector code in FBS
- New waste-related FBAs/FBSs, and 7/8 digit sector codes
- New food waste and concrete waste specific FBS
- New data visualization functions: Sankey, stackedBarChart
- Updates to methodology since v1.2.4 release changed the results for: CAPHAPnational2017, CNHWnational2014, Landnational2012, Waternational2010m2, Waternational2015m2, Waternational2015m3
- option to return an FBS with any mix of sector lengths, SPB and SCB columns no longer need to have matching sector lengths
- option for multiplication as allocation method
- update allocation methodology so that if the allocation dataset is at a more aggregated geoscale than the primary FBA, the primary df is not aggregated to match the allocation geoscale. Instead the more aggregated allocation dataset merges with the primary FBA on all related less aggregated geoscales
- Changes to github actions to address memory issues
- in equal_allocation() aggregate column before allocating to child naics - this impacts some NAICS6 results
- option to retain activity names in a primary data source after calling on those activities in an FBS, so the activity names can be used again
- requires plotly, kaleido
Minor Updates:
- new externalpaths.env file to store local file paths
- rename loadapikey to loadenvfilekey(), used for api keys and externalpaths.env
- getFlowByActivity accepts flowclass of class or list
- new sector codes: S00203 - other state and local government enterprises, F040 - exports of goods and services, F050 - imports of goods and services, new 7/8 digit sectors for waste sectors (5622121, 5622191, 5622192, 5629201, 5629202, 5629203)
- New required columns in FBS: ProducedBySectorType, ConsumedBySectorType, AttributionSources (primary attribution source name)
- new selectionfields FBS parameter to subset FBA using column names/values
- new VisualizationEssentials.csv with standard colors for graphing sectors
- new csv "sector2012names" which includes additional names beyond official NAICS
New/Modified FBA:
- expand epacddpath sector mapping
- EPAFactsAndFigures, EPAREI, EPAWARMer, EPAWFR, CensusASM, EPACDDPath (2018), EIAAEO, EIA_SEDS
New/Modified FBS:
- CNHWnational2018, Employmentnational (2012, 2014, 2015), Foodwastenational2018 (m1, m2), GHGnationalm1 (2016, 2017, 2018, 2019), REIwastenational_2012
Published by catherinebirney over 3 years ago
flowsa - v1.2.4
FBA Updates
- USDAERSFIWS through 2020
- NOAA_FisheriesLandings through 2021
- BLS_QCEW data through 2021
FBS Updates #255
- Move most employment FBS text to Employment_common.yaml
- New Employmentnational2018
- New Employmentstate2020
BEA updates from useeiorv1.1.0 #230
- Updates 2012 Make and Use tables to use latest BEA release in Fall of 2021 (see https://github.com/USEPA/useeior/pull/221)
- Updates industry gross output data from same release, and expands to include 2019 and 2020 data years
- Removed deprecated `BEA_2012_Detail_Use_Industry_Transactions.csv`
- Changes to BEA data modify results of CAPHAPnational2017, Waternational2010m2, and Waternational2015_m2
Other updates
- Correct metadata URLs for FBA and FBS YAMLs #262
- Old metadata URLs can be corrected by replacing "data" with "methods"
Full Changelog: https://github.com/USEPA/flowsa/compare/v1.2.3...v1.2.4
Published by catherinebirney almost 4 years ago
flowsa - v1.2.3
Flow-By-Activity Updates
- Fixed broken link for USGSMYBCopper #252
- Add BLS_QCEW 2019 FBA #248
- Add method_status.yaml to document list of known inactive method files #251
- New Census_QWI FBAs for 2002, 2010 - 2020 #242
Flow-By-Sector Updates
- Add state employment FBS for 2018/2019 #248
Requirement Updates
- Update to pandas 1.4.0 #238
- Drop support for Python 3.7 (due to pandas update) #237
GitHub Actions
- Separate actions for FBA config and FBS method testing #251
- In FBA testing, skip inactive FBA methods identified in method_status.yaml #251
Full Changelog: https://github.com/USEPA/flowsa/compare/v1.2.2...v1.2.3
Published by catherinebirney almost 4 years ago
flowsa - v1.2.2
Flow-By-Sector Updates
- Use useeio summary sector target levels for `Water_state_2015_m1` and `CNHW_state_2014` (#226), disaggregates utilities sector (221)
- Fixes error in `CNHW_state` and `CNHW_national` introduced by change in BLS allocation data
Flow-By-Activity Updates
New FBA:
- EPAStateGHG (#231) (source data file not yet available publicly)
- EPASIT (#231)
- stateior (#231)
Other Updates:
- Added years 2018-2020, and fixed missing data for Census_CBP (#223, #229)
Functionality Additions
- new BEA summary crosswalks and functions to support analysis including `map_to_BEA_sectors()` and `calculate_industry_coefficients()`
- improved error handling and custom exceptions (#216)
Full Changelog: https://github.com/USEPA/flowsa/compare/v1.2.1...v1.2.2
Published by bl-young almost 4 years ago
flowsa - v1.2
Flow-By-Sector Updates
New FBS:
1. CNHWstate2014
2. CRHWstate2017
3. Employmentstate 2012-2017
4. GRDRELstate2017
5. TRIDMRstate2017
FBS Updates/Changes to methodology:
1. Addressed errors in the Waternational FBS methodology
a. The USGSNWISWU data is modified to distinguish between crop and golf irrigation. The function `checkgolfandcropirrigationtotals()` checks if a state distinguishes between crop/golf irrigation and if not, assigns all flows to crop irrigation. An error in the function resulted in several states not having total irrigation water assigned to irrigation crop water withdrawals. The correction increased the total amount of water withdrawn for crop irrigation and resulted in a redistribution of water by sectors.
b. Previous versions allocated "Livestock" water use to aquaculture sectors, resulting in double counting of water use for aquaculture, as aquaculture is a separate water use category. The updated methodology drops aquaculture from USDA CoA cropland data before allocating to sectors, removing the double counting.
2. Addressed error in Landnational2012 methodology. In the previous version, land use for aquaculture was only accounted for in 2 states. This version accounts for land use in all states, resulting in an increase in land use for aquaculture and a decrease in land use for other sectors in the animal land use category.
3. Refactors CAPHAPnational2017.yaml to split activity set 3 into separate activity sets, such that the subsetting columns from the NEINonpoint2017asets.csv file are no longer necessary. Removes the subsetting columns from NEINonpoint2017_asets.csv (#213)
4. Adjust assignment of NAICS to facilities for stewicombo data: use NAICS from FRS list only if a single NAICS is provided, else use the NAICS assigned by the inventory; reduces risk of inconsistent NAICS application (#151)
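The NAICS selection rule for stewicombo facilities can be read as a small decision function. The sketch below is an interpretation under stated assumptions: the function name `choose_naics` and the example codes are hypothetical, and "a single NAICS" is taken to mean exactly one distinct code in the FRS list.

```python
def choose_naics(frs_naics, inventory_naics):
    """Use the FRS NAICS only when FRS lists exactly one distinct code;
    otherwise fall back to the NAICS assigned by the inventory."""
    unique_codes = set(frs_naics)
    if len(unique_codes) == 1:
        return next(iter(unique_codes))
    return inventory_naics


print(choose_naics(["325110"], "325199"))            # single FRS code is used
print(choose_naics(["325110", "325120"], "325199"))  # ambiguous FRS list
```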
Changes to FBS method YAML file setup:
1. Added a flowsa_yaml module including a custom yaml loader called FlowsaLoader which implements 4 tags that can be used in FBA and FBS method yaml files.
a. !include: allows inheriting arbitrary nodes from other yaml files
b. !external_config: adds additional target paths to search for included files (#207)
c. !from_index: specifies the activity set file to load (#210)
d. !script_function: specifies the python file and function name to load in FBA and FBS yaml files (#214)
Refactors FBA method yamls, along with their associated csvs, to use descriptive activity set names, rather than just activityset1, etc.
Add a NAICS dictionary relevant for BEA Summary level models which can be accessed using `!include:BEA_summary_target.yaml` in any FBS method (#197)
New parameter `activity_to_sector_mapping` to specify the name of the sector mapping file to use, rather than loading a mapping file that matches the source name (#186)
Flow-By-Activity Updates
New FBA:
1. BTS TSA FBA (Transportation Satellite Account data)
2. USGS MYB Lead 2018/19
3. EPA EQUATES FBA
FBA Updates/Changes to Methodology:
1. BLSQCEW: updated the FBA FlowNames to distinguish ownership (e.g., federal government). This change enables better estimates of suppressed data, which impacts results of all FBS methods that use employment as an allocation source.
2. USDACoACropland: updated the FlowName for irrigated, harvested cropland (111) to match the FlowNames that aggregate to that value. This change does not impact FBS results.
3. Updated GHGI FBA column assignments for flows and activities in some tables.
4. Fixed broken Census PEP Population links and added data for 2018-21
5. Condensed all USGSMYB python scripts into single file, USGSMYB.py
6. Fixed misaligned coordinates for EIAMECSEnergy2018
Updates to Crosswalks/Mapping Activities to Sectors:
- Expand NAICS2012Crosswalk and NAICSCrosswalkTimeSeries to include 7-digit NAICS
- Removed NAICSCrosswalkEPA_GHGI.csv file. Will re-add to master branch once the crosswalk is finalized.
- Updated NAICSCrosswalkUSDACoALivestock.csv to drop unused NAICS8 for cow categories and correct the "Pheasants" sector assignment
- Update NAICSCrosswalkeLCI.csv to modify electricity sector assignments from NAICS6 to NAICS7
Functionality Additions:
- Option to allow Flow-By-Sector method YAMLs (and crosswalks and activity set files) to be kept outside of the FLOWSA package in a local directory and passed for processing in a `getFlowBySector()` call. This will allow development of FBS methods outside of the main flowsa repo. (#172, #178)
- Ability to include multiple sector levels within a single FBS dataset, see "flowbysectormethods/BEAsummarytarget.yaml" for an example. (#181)
- Option to split data into rural/urban classification by FIPS (#173)
- Modify method of reading sector crosswalks to use the sector length identified in the column header rather than the length of the text string. This change enables flexible assignments to each sector length (e.g., 2-digit government sector is F010)
- New functionality "Process adjustments" which allow for adjustments to the "SectorProducedBy" field for data obtained from stewicombo. Records are reassigned to the "target_naics." (#190)
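To illustrate reading the sector length from the crosswalk column header instead of the code's string length, the sketch below uses hypothetical column names of the form `NAICS_<n>`. The F010 entry follows the example given in the list above; the other rows and the helper name are made up for illustration.

```python
def sector_levels(crosswalk_rows):
    """Map each sector code to the level named in its column header
    (e.g. 'NAICS_2' -> 2) rather than to len(code)."""
    levels = {}
    for row in crosswalk_rows:
        for column, code in row.items():
            if code:  # skip empty cells
                levels[code] = int(column.rsplit("_", 1)[1])
    return levels


# Hypothetical crosswalk rows; column names are assumptions.
rows = [
    {"NAICS_2": "11", "NAICS_3": "111"},
    {"NAICS_2": "F010", "NAICS_3": ""},
]
levels = sector_levels(rows)
print(levels)
print(len("F010"), levels["F010"])  # string length vs header-based level
```

A length-based rule would treat "F010" as a 4-digit sector, while the header-based rule keeps it at the 2-digit level.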
Notable Modifications to Functions:
- Updates equally_allocate_suppressed_parent_to_child_naics() to allocate data from the nearest unsuppressed parent sector rather than from the nearest parent sector level that contained no suppressed data. This change results in more accurate data allocation, where child sectors do not sum to more than parent sectors (unless rounding in the original df causes minor differences). Also adds validation checks within the function: one compares original flow amounts at each sector level to the flow amounts in the revised df, and a second compares summed child sectors to parent sector values.
- Updates disaggregate_pastureland() and disaggregate_cropland() so that when a parameter is marked to be dropped (like 1125 (aquaculture) in the water FBS method), any parent sectors are also dropped. The df is then re-aggregated and outputs more accurate allocation ratios.
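The idea of allocating from the nearest unsuppressed parent can be sketched as below. This is a simplified illustration under stated assumptions, not the code in equally_allocate_suppressed_parent_to_child_naics(): sectors are plain strings whose parent is the code minus its last character, suppressed values are marked with `None`, and the function names and sample amounts are hypothetical.

```python
def nearest_unsuppressed_parent(sector, flows):
    """Walk up the sector hierarchy until a parent with a known amount is found."""
    parent = sector[:-1]
    while parent and flows.get(parent) is None:
        parent = parent[:-1]
    return parent or None


def allocate_suppressed(flows, child_len):
    """Fill suppressed sectors of length child_len from their nearest known parent."""
    result = dict(flows)
    parents = {s: nearest_unsuppressed_parent(s, flows) for s in flows}
    # Group suppressed child sectors by the parent they will draw from.
    targets = {}
    for sector, amount in flows.items():
        if amount is None and len(sector) == child_len and parents[sector]:
            targets.setdefault(parents[sector], []).append(sector)
    for parent, suppressed in targets.items():
        # Amounts already reported by known sectors directly beneath this parent.
        accounted = sum(
            amount for sector, amount in flows.items()
            if amount is not None and parents[sector] == parent
        )
        remainder = max(flows[parent] - accounted, 0.0)
        for sector in suppressed:
            result[sector] = remainder / len(suppressed)
    return result


# Hypothetical data: None marks a suppressed value.
flows = {
    "11": 100.0, "111": None, "1111": 30.0, "1112": None, "1113": None,
    "112": 40.0, "1121": None, "1122": 10.0,
}
filled = allocate_suppressed(flows, child_len=4)
child_total = sum(v for s, v in filled.items() if len(s) == 4)
```

In this example the suppressed sectors under the suppressed parent 111 draw from 11, while 1121 draws from its own unsuppressed parent 112, so the 4-digit sectors sum to the 2-digit total rather than exceeding it.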
Published by catherinebirney about 4 years ago
flowsa - v1.1
New Features:
- Data visualization functions
- GitHub actions testing for python 3.7
- Allow for .yaml files to recursively inherit other .yaml files. Keys in children will overwrite the same key from a parent.
- Incorporate yaml anchors to simplify yaml method files by reducing duplicated information
- Expand functionality of the FBA .yaml files to read in a dictionary of relevant data table information to reduce hard coding in source.py files or requiring additional .yaml files with year-specific information
- Add additional BEA government codes to stand in as sectors (S00101 - Federal electric utilities, S00202 - State and local government electric utilities)
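The recursive inheritance rule (keys in children overwrite the same key from a parent) can be sketched with plain dictionaries standing in for parsed .yaml files. The `inherits_from` key, the method names, and the `resolve` helper are hypothetical and only illustrate the merge order. They are not the keys or functions flowsa uses.

```python
def resolve(name, methods, _seen=()):
    """Return the config for `name` with all inherited keys merged in."""
    if name in _seen:
        raise ValueError(f"circular inheritance involving {name!r}")
    config = dict(methods[name])  # copy so the input is left untouched
    parent_name = config.pop("inherits_from", None)
    if parent_name is None:
        return config
    merged = resolve(parent_name, methods, _seen + (name,))
    merged.update(config)  # child keys overwrite parent keys
    return merged


# Hypothetical parsed method files.
methods = {
    "base": {"target_sector_level": "NAICS_6", "geoscale": "national"},
    "mid": {"inherits_from": "base", "geoscale": "state"},
    "leaf": {"inherits_from": "mid", "year": 2015},
}
resolved = resolve("leaf", methods)
```

This sketch performs a shallow merge, so a nested dictionary in a child replaces the parent's nested dictionary wholesale. Whether nested keys should merge more deeply is a design choice the release note does not specify.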
Changes to functions:
- Simplified all functions that load a specific dictionary into one function
- Remove the duplicated information in function convert_fba_unit() and instead use standardize_units()
- Rename find_true_file_path() to get_flowsa_base_name()
- url_replace_fxn(), call_response_fxn(), and parse_response_fxn() now accept only keyword arguments, removing need for particular order for passing arguments
- The subset of functions that begin with (urlreplace, callresponse, and parseresponse) now accept any number of keyword arguments, so each function now lists only the arguments it actually needs
- Move make_url_request() to EPA's esupy repository for use in other EPA LCIA tools
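The keyword-only pattern described above can be sketched as follows. The two functions, their parameter names, and the example URL are hypothetical stand-ins, not flowsa functions. The point is that each function names only the arguments it needs and absorbs the rest through `**_`.

```python
def url_replace_sketch(*, build_url, year, **_):
    """Fill the year placeholder. Unused keyword arguments are ignored."""
    return build_url.replace("__year__", str(year))


def parse_response_sketch(*, text, source, **_):
    """Turn a comma-separated response into records, ignoring other kwargs."""
    return [{"Source": source, "Value": value} for value in text.split(",")]


# One shared bundle of keyword arguments can be passed to every function.
kwargs = {
    "build_url": "https://example.com/data/__year__.csv",
    "year": 2015,
    "text": "1,2",
    "source": "EXAMPLE",
}
url = url_replace_sketch(**kwargs)
rows = parse_response_sketch(**kwargs)

# Positional calls are rejected, so argument order no longer matters.
try:
    url_replace_sketch("https://example.com/data/__year__.csv", 2015)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

The trade-off of accepting `**_` is that a misspelled keyword is silently ignored rather than raising an error.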
Minor Modifications:
- Moved FBA/FBS methods out of data/ to separate methods/ directory
- Modified code to PEP 8 styling
- Remove memory cache when running FBAs
- Rename all activity to sector mapping files to standardized naming convention NAICS_Crosswalk_DATASOURCE.csv
- Update broken USGS links for MYB Zeolites and Gallium
New Contributors: @matthewlchambers
Published by catherinebirney over 4 years ago
flowsa - v1.0.1
This update addresses activity to sector mapping discrepancies that impact three FBS:
Waternational20XX_m1:
- addresses dropped sector in crop irrigation allocation (111940)
- removes thermoelectric allocation to wind (221115)
- drops allocation of crop irrigation for pastureland to aquaculture (1125) because aquaculture is attributed separately
Waternational20XXm1 and Landnational_2012:
- adds orange groves and noncitrus groves to CoA and IWMS crosswalk for orchards (11131 and 11132)
Waternational20XX_m2:
- adds additional NAICS to Blackhurst IO crosswalk (21232 and 21239) and modifies activity to sector mapping for sectors 212210 and 213112
Published by catherinebirney over 4 years ago
flowsa - v1.0.0
- Address duplication in USDAERSMLU (changes results of Landnational2012 FBS)
- Use a .env file to hold the API_Keys instead of .txt files (add .env to .gitignore), add an example .env file to 'examples' directory
- Use git describe to return package version number for use in meta and filename instead of manually updating a version number in common.py
- Adds a memory cache using 'joblib' so FBAs can be regenerated quickly if a url has already been called
- Two new package requirements: joblib (for url memory cache) and python-dotenv (reads .env files)
- New function "equalallocation", called within the "directallocation" method, which checks whether an Activity is being allocated to more than one sector and, if so, equally allocates the Activity FlowAmount to all sectors
- Add additional unit conversions in standardize_units()
- Address changes in USGS MYB urls
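The equal allocation behavior described in this release can be sketched as below. This is an illustration only: the record layout, the crosswalk contents, the flow amounts, and the `equally_allocate` helper are assumptions, not the flowsa function.

```python
def equally_allocate(records, crosswalk):
    """Split each Activity's FlowAmount evenly across its mapped sectors."""
    allocated = []
    for record in records:
        sectors = crosswalk.get(record["Activity"], [])
        for sector in sectors:
            allocated.append({
                "Activity": record["Activity"],
                "Sector": sector,
                "FlowAmount": record["FlowAmount"] / len(sectors),
            })
    return allocated


# Hypothetical activities and sector mappings.
records = [
    {"Activity": "Orchards", "FlowAmount": 90.0},
    {"Activity": "Aquaculture", "FlowAmount": 20.0},
]
crosswalk = {"Orchards": ["11131", "11132"], "Aquaculture": ["1125"]}
allocated = equally_allocate(records, crosswalk)
amounts = {row["Sector"]: row["FlowAmount"] for row in allocated}
```

An Activity mapped to a single sector keeps its full FlowAmount, and the total per Activity is preserved after the split.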
New Contributors: Thanks to Andy Chase and David Graham for reviewing FLOWSA for the v1 release.
Published by catherinebirney over 4 years ago
flowsa - v0.4.1
Changes from v0.4.0:
1. Modified the NETLEIAPlantWater crosswalk sector assignments for "Municipal Solid Waste" (now NAICS 221117) and "Other Gases" (now 221112) (changes results for FBS Waternational2015m3)
2. Modified the USGSNWISWU crosswalk sector assignment for "Thermoelectric Power". The assignment changed from NAICS 2211 to 221112, 221113, 221114, 221115, 221116, 221117, and 221118, so thermoelectric water is no longer allocated to hydroelectric or Electric Power Transmission, Control, and Distribution (changes results for FBS Waternational2015m1)
Published by catherinebirney over 4 years ago
flowsa - v0.4.0
New Flow-By-Sectors:
1. CNHWCA2014: Commercial non-hazardous waste, excluding construction activities for state of California
2. CNHWnational2014 (and 2017): Commercial non-hazardous waste, excluding construction activities
3. CNHWCnational2014: Commercial non-hazardous waste from construction activities
4. Electricitygenemissionsnational2016: Import and format data from ElectricityLCI
5. Waternational2015m3: Modified version of m1 where thermoelectric water use is allocated to NAICS6 using EIA Thermoelectric data modified by NETL (and used in electricityLCI) rather than equally allocating 4-digit NAICS to all 6-digit NAICS
6. Waterstate2015m1: Initial version of state level water withdrawal (not finalized)
Changes to Waternational2015_m1:
- drop consumptive water use for thermoelectric and irrigation due to limited data
New FBAs:
1. USGS Minerals Yearbook
2. NETLEIAPlantWater: thermoelectric plant water withdrawal/use
3. EIAMECS 2018 data
4. EPACDDPath: Construction debris
5. EPA Nitrogen and Phosphorus Inventories
6. USDAACUP: Agricultural Chemical Program (Fertilizer and Pesticides)
7. USGSSPARROW data
8. CensusVIP (Value of Construction)
9. CalRecycleWasteCharacterization (finalized version)
Other changes:
- Implemented GitHub actions, auto test example builds for FBA and FBS
- Implemented option to download missing local FBAs from Data Commons when generating a FBS
- Option to map allocation source data to federal elementary flow list to standardize flowable/context columns for merging purposes
- Add automated method of checking for and dropping duplicated information across activity sets during FBS creation
- Moved functions from common.py to new settings.py
New Contributors: Andrew Beck
Published by catherinebirney over 4 years ago
flowsa - v0.3.1
- convert NEI units to match those in fed mapping file
- update assignment of PM overlap to include UUID and Flowable since it occurs after mapping
- modify allocation of public supply deliveries to domestic in water methodology to correctly allocate to ground/surface water
- address dropped Flowable for Employment FBS
- new default column groupings for a FBA df mapped to fed flow list that include sector cols
- update sourcename for "BEAUseDetailPROBeforeRedef" FBA
- modify usgs sodaash parsing to address error in 2017 data
- option to ignore meta column in compareFBSresults() fxn
Published by catherinebirney over 4 years ago
flowsa - v0.3
Major changes:
- Address esupy dependency conflict between stewi and flowsa
- add function seeAvailableModels() to __init__.py for use in the examples folder so users can find existing models
- updates to Census PEP and Census CBP FBA addressing errors in creating FBAs for certain years (existing url calls overloaded the API and some years of data were not available)
- within USDACoACropland data, equally allocate parent NAICS to child NAICS when 'Land in Farms' data cannot be used. Reduces data loss for water methodology 'crop irrigation'
- correct allocation error within Landnational2012: area for urban land for residential housing should have been assigned a value from the American Housing Survey. Instead, the AHS value was subtracted from the total MLU activity. Correction results in reduction of 11.5% land assigned to urban residential housing
- updates pandas and xlrd requirements to address any possible errors with loading xlsx and xls files
Minor changes:
- modify FBS yaml to drop unnecessary parameters
- moved some functions to new allocation.py file
- renamed "mapping.py" to "sectormapping.py" to clarify sector mapping from fedefelm mapping
- add explanations throughout water and land FBS for why FlowAmounts are modified
- incorporate more validation checks within FBS.py
Published by catherinebirney over 4 years ago
flowsa - v0.2.1
- FBS includes FlowUUID column
- All relevant files (json metadata and log) are also downloaded from Data Commons when downloading a parquet
- New log file for code validation
- No units are modified for any imported Flow-By-Activity
- Addressed the Census_CBP FBA generation errors for 2010 and 2011
Published by catherinebirney almost 5 years ago
flowsa - v0.2
- Generate Flow-By-Activity (FBA) metadata, store in user's local directory
- Generate Flow-By-Sector (FBS) metadata by calling on previously generated FBA meta, store in user's local directory
- Remove "date_generated" parameter from FBA method yaml, instead capture the date a dataset is generated in the meta
- Add option to download FBA from Data Commons remote repository rather than running the FBA
- Ensure all FBAs called on within a function for allocation purposes are captured within metadata and bibliography files
- Dynamically import functions required to generate a FBA or FBS rather than importing all possible functions within flowbyactivity.py or flowbysector.py
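Dynamic importing of the functions a method needs can be sketched with the standard library's importlib. The `load_function` helper and the method dictionary are hypothetical, and the standard `math` module stands in for a data source module so the example runs on its own.

```python
import importlib


def load_function(module_name, function_name):
    """Import a module by name at run time and return one of its functions."""
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Hypothetical method entry naming the module and function to use.
method = {"module": "math", "function": "sqrt"}
fn = load_function(method["module"], method["function"])
value = fn(16)
```

With this pattern only the modules named by the method being run are imported, instead of every possible source module up front.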
Published by catherinebirney almost 5 years ago
flowsa - v0.1.1
Update to methodology in Landnational2012 Flow-By-Sector for USDAERSMLU Activity Set 3 ('Cropland used for pasture', 'Forest-use land grazed', 'Grassland pasture and range'):
- Estimate suppressed data in USDACoACropland_NAICS by equally allocating parent NAICS to child NAICS, reduces data loss from ~1% to 0%
Update to methodology in Landnational2012 Flow-By-Sector for USDAERSMLU Activity Set 1 and 2 (Cropland):
- Adds 'Berry Totals' and 'Orchards'
Update to methodology in Employmentnational2017 Flow-By-Sector to ensure all sectors are NAICS2012Code
Create Flow-By-Activity method yamls for FBAs previously created in the 'Scripts' directory to streamline FBA generation
Published by catherinebirney about 5 years ago
flowsa - v0.1
Produces FlowBySector datasets for use in USEEIOv2.0.
- Revised methodology for CAPHAPnational2017 and Landnational_2012
- FLOWSA package restructured
Published by catherinebirney about 5 years ago
flowsa - v0.0.2
Produces flowbysector datasets for use in USEEIOR0.2.0-v2a.
FlowBySector datasets:
1. CAPHAPnational2017
2. CRHWnational2017
3. Employmentnational2017
4. GRDRELnational2017
5. Landnational2012
6. TRIDMRnational2017
7. Waternational2015_m1
Published by catherinebirney over 5 years ago