- Tor Bandwidth File Format
- juga
- teor
- 1. Scope and preliminaries
- This document describes the format of Tor's Bandwidth File, version
- 1.0.0 and later.
- It is a new specification for the existing bandwidth file format,
- which we call version 1.0.0. It also specifies new format versions
- 1.1.0 and later, which are backwards compatible with 1.0.0 parsers.
- Since Tor version 0.2.4.12-alpha, the directory authorities use
- the Bandwidth File called "V3BandwidthsFile" generated by
- Torflow [1]. The details of this format are described in Torflow's
- README.spec.txt. We also summarise the format in this specification.
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
- 1.2. Acknowledgements
- The original bandwidth generator (Torflow) and format were
- created by mike. Teor suggested writing this specification while
- contributing to pastly's new bandwidth generator implementation.
- This specification was revised after feedback from:
- Nick Mathewson (nickm)
- Iain Learmonth (irl)
- 1.3. Outline
- Sections 3.4.1 and 3.4.2 of the Tor directory protocol
- (dir-spec.txt [3]) use the term "bandwidth measurements" to refer to
- what is called a Bandwidth File here.
- A Bandwidth File contains information on relays' bandwidth
- capacities and is produced by bandwidth generators, previously known
- as bandwidth scanners.
- 1.4. Format Versions
- 1.0.0 - The legacy Bandwidth File format
- 1.1.0 - Add a header containing information about the bandwidth
- file. Document the sbws and Torflow relay line keys.
- 1.2.0 - If there are not enough eligible relays, the bandwidth file
- SHOULD contain a header, but no relays. (To match Torflow's
- existing behaviour.)
- Adds new KeyValue Lines to the Header List section with
- statistics about the number of relays included in the file.
- Add new KeyValues to Relay Bandwidth Lines, with different
- bandwidth values (averages and descriptor bandwidths).
- 1.3.0 - Adds scanner and destination countries to the header.
- 1.4.0 - Adds monitoring KeyValues to the header and relay lines.
- RelayLines for excluded relays MAY be present in the bandwidth
- file for diagnostic reasons. Similarly, if there are not enough
- eligible relays, the bandwidth file MAY contain all known relays.
- Diagnostic relay lines SHOULD be marked with vote=0, and
- Tor SHOULD NOT use their bandwidths in its votes.
- All Tor versions can consume format version 1.0.0.
- All Tor versions can consume format version 1.1.0 and later,
- but Tor versions earlier than 0.3.5.1-alpha warn if the header
- contains any KeyValue lines after the Timestamp.
- Tor versions 0.4.0.3-alpha, 0.3.5.8, 0.3.4.11, and earlier do not
- understand "vote=0". Instead, they will vote for the actual bandwidths
- that sbws puts in diagnostic relay lines:
- * 1 for relays with "unmeasured=1", and
- * the relay's measured and scaled bandwidth when "under_min_report=1".
- 2. Format details
- The Bandwidth File MUST contain the following sections:
- - Header List (exactly once), which is a partially ordered list of
- - Header Lines (one or more times), then
- - Relay Lines (zero or more times), in an arbitrary order.
- If it does not contain these sections, parsers SHOULD ignore the file.
- 2.1. Definitions
- The following nonterminals are defined in Tor directory protocol
- sections 1.2., 2.1.1., 2.1.3.:
- bool
- Int
- SP (space)
- NL (newline)
- KeywordChar
- ArgumentChar
- nickname
- hexdigest (a '$', followed by 40 hexadecimal characters
- ([A-Fa-f0-9]))
- Nonterminal defined in section 2 of version-spec.txt [4]:
- version_number
- We define the following nonterminals:
- Line ::= ArgumentChar* NL
- RelayLine ::= KeyValue (SP KeyValue)* NL
- HeaderLine ::= KeyValue NL
- KeyValue ::= Key "=" Value
- Key ::= (KeywordChar | "_")+
- Value ::= ArgumentCharValue+
- ArgumentCharValue ::= any printing ASCII character except NL and SP.
- Terminator ::= "=====" or "===="
- Generators SHOULD use a 5-character terminator.
- Timestamp ::= Int
- Bandwidth ::= Int
- MasterKey ::= a base64-encoded Ed25519 public key, with
- padding characters omitted.
- DateTime ::= "YYYY-MM-DDTHH:MM:SS", as in ISO 8601
- CountryCode ::= Two capital ASCII letters ([A-Z]{2}), as defined in
- ISO 3166-1 alpha-2 plus "ZZ" to denote unknown country
- (e.g. the destination is in a Content Delivery Network).
- CountryCodeList ::= One or more CountryCode(s) separated by a comma
- ([A-Z]{2}(,[A-Z]{2})*).
- Note that key_value and value are defined in Tor directory protocol
- with different formats to KeyValue and Value here.
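- As an illustration only, the KeyValue and RelayLine productions above
- can be sketched in Python. The function names are hypothetical, and the
- character class for KeywordChar assumes the dir-spec definition
- (letters, digits and "-"). Tokens that do not match KeyValue are skipped
- rather than rejected, which is one possible reading of "ignore extra
- material".

```python
import re

# Key ::= (KeywordChar | "_")+ ; KeywordChar is assumed to be [A-Za-z0-9-].
KEY_RE = r"[A-Za-z0-9_-]+"
# ArgumentCharValue: printing ASCII except SP and NL (0x21 to 0x7E).
VALUE_RE = r"[\x21-\x7E]+"
KEYVALUE_RE = re.compile(rf"({KEY_RE})=({VALUE_RE})")


def parse_key_value(token):
    """Return (key, value) for one KeyValue token, or None if malformed."""
    match = KEYVALUE_RE.fullmatch(token)
    return match.groups() if match else None


def parse_relay_line(line):
    """Parse a RelayLine into a dict of strings, skipping malformed tokens."""
    pairs = {}
    for token in line.rstrip("\n").split(" "):
        parsed = parse_key_value(token)
        if parsed is None:
            continue  # unrecognised material is ignored
        key, value = parsed
        pairs.setdefault(key, value)  # arbitrary choice: keep the first duplicate
    return pairs
```

- Because Key cannot contain "=", a Value such as "b=c" in "a=b=c" is
- still split unambiguously at the first "=".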
- Tor versions earlier than 0.3.5.1-alpha require all lines in the file
- to be 510 characters or less. The previous limit was 254 characters in
- Tor 0.2.6.2-alpha and earlier. Parsers MAY ignore longer Lines.
- Note that directory authorities are only supported on the two most
- recent stable Tor versions, so we expect that line limits will be
- removed after Tor 0.4.0 is released in 2019.
- 2.2. Header List format
- It consists of a Timestamp line and zero or more HeaderLines.
- All the header lines MUST conform to the HeaderLine format, except
- the first Timestamp line.
- The Timestamp line is not a HeaderLine to keep compatibility with
- the legacy Bandwidth File format.
- Some header Lines MUST appear in specific positions, as documented
- below. All other Lines can appear in any order.
- If a parser does not recognize any extra material in a header Line,
- the Line MUST be ignored.
- If a header Line does not conform to this format, the Line SHOULD be
- ignored by parsers.
- It consists of:
- Timestamp NL
- [At start, exactly once.]
- The Unix Epoch time in seconds of the most recent generator bandwidth
- result.
- If the generator implementation has multiple threads or
- subprocesses which can fail independently, it SHOULD take the most
- recent timestamp from each thread and use the oldest value. This
- ensures all the threads continue running.
- If there are threads that do not run continuously, they SHOULD be
- excluded from the timestamp calculation.
- If there are no recent results, the generator MUST NOT generate a new
- file.
- It does not follow the KeyValue format for backwards compatibility
- with version 1.0.0.
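- The multi-thread rule above (most recent timestamp per thread, then the
- oldest of those) can be sketched as follows. The function name and the
- input mapping are hypothetical, and the caller is assumed to pass only
- threads that run continuously. Returning None when any thread has no
- results is a conservative choice made for this sketch, not something the
- specification requires.

```python
def header_timestamp(results_by_thread):
    """Return the Timestamp for the file, or None if it cannot be computed.

    results_by_thread maps a thread name to the Unix times (in seconds)
    of that thread's recent bandwidth results.
    """
    if not results_by_thread:
        return None  # no recent results: do not generate a new file
    if any(not times for times in results_by_thread.values()):
        return None  # a continuously running thread produced nothing
    # Most recent result of each thread, then the oldest of those values.
    return min(max(times) for times in results_by_thread.values())
```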
- "version=" version_number NL
- [In second position, zero or one time.]
- The specification document format version.
- It uses semantic versioning [5].
- This Line was added in version 1.1.0 of this specification.
- Version 1.0.0 documents do not contain this Line, and the
- version_number is considered to be "1.0.0".
- "software=" Value NL
- [Zero or one time.]
- The name of the software that created the document.
- This Line was added in version 1.1.0 of this specification.
- Version 1.0.0 documents do not contain this Line, and the software
- is considered to be "torflow".
- "software_version=" Value NL
- [Zero or one time.]
- The version of the software that created the document.
- The version may be a version_number, a git commit, or some other
- version scheme.
- This Line was added in version 1.1.0 of this specification.
- "file_created=" DateTime NL
- [Zero or one time.]
- The date and time timestamp in ISO 8601 format and UTC time zone
- when the file was created.
- This Line was added in version 1.1.0 of this specification.
- "generator_started=" DateTime NL
- [Zero or one time.]
- The date and time timestamp in ISO 8601 format and UTC time zone
- when the generator started.
- This Line was added in version 1.1.0 of this specification.
- "earliest_bandwidth=" DateTime NL
- [Zero or one time.]
- The date and time timestamp in ISO 8601 format and UTC time zone
- when the first relay bandwidth was obtained.
- This Line was added in version 1.1.0 of this specification.
- "latest_bandwidth=" DateTime NL
- [Zero or one time.]
- The date and time timestamp in ISO 8601 format and UTC time zone
- of the most recent generator bandwidth result.
- This time MUST be identical to the initial Timestamp line.
- This duplicate value is included to make the format easier for people
- to read.
- This Line was added in version 1.1.0 of this specification.
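- A minimal sketch of the relationship between the Timestamp line and
- latest_bandwidth, using only the Python standard library. The helper
- names are hypothetical. The values in the test are the ones that appear
- in the sample in section A.2.

```python
from datetime import datetime, timezone


def to_datetime(timestamp):
    """Format a Unix time as a DateTime ("YYYY-MM-DDTHH:MM:SS") in UTC."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


def latest_bandwidth_matches(timestamp, latest_bandwidth):
    """Check that latest_bandwidth denotes the same time as the Timestamp."""
    return to_datetime(timestamp) == latest_bandwidth
```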
- "number_eligible_relays=" Int NL
- [Zero or one time.]
- The number of relays that have enough measurements to be
- included in the bandwidth file.
- This Line was added in version 1.2.0 of this specification.
- "minimum_percent_eligible_relays=" Int NL
- [Zero or one time.]
- The percentage of relays in the consensus that SHOULD be
- included in every generated bandwidth file.
- If this threshold is not reached, format versions 1.3.0 and earlier
- SHOULD NOT contain any relays. (Bandwidth files always include a
- header.)
- Format versions 1.4.0 and later SHOULD include all the relays for
- diagnostic purposes, even if this threshold is not reached. But these
- relays SHOULD be marked so that Tor does not vote on them.
- See section 1.4 for details.
- The minimum percentage is 60% in Torflow, so sbws uses
- 60% as the default.
- This Line was added in version 1.2.0 of this specification.
- "number_consensus_relays=" Int NL
- [Zero or one time.]
- The number of relays in the consensus.
- This Line was added in version 1.2.0 of this specification.
- "percent_eligible_relays=" Int NL
- [Zero or one time.]
- The number of eligible relays, as a percentage of the number
- of relays in the consensus.
- This line SHOULD be equal to:
- (number_eligible_relays * 100.0) / number_consensus_relays
- This Line was added in version 1.2.0 of this specification.
- "minimum_number_eligible_relays=" Int NL
- [Zero or one time.]
- The minimum number of relays that SHOULD be included in the bandwidth
- file. See minimum_percent_eligible_relays for details.
- This line SHOULD be equal to:
- number_consensus_relays * (minimum_percent_eligible_relays / 100.0)
- This Line was added in version 1.2.0 of this specification.
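- The two formulas above can be sketched as below. The specification
- defines both values as Int but does not say how to round, so the use of
- Python's round() here is an assumption, as is returning 0 for an empty
- consensus. The 7000-relay figures in the test are illustrative only.

```python
def percent_eligible_relays(number_eligible_relays, number_consensus_relays):
    """(number_eligible_relays * 100.0) / number_consensus_relays, as an Int."""
    if number_consensus_relays <= 0:
        return 0  # assumption: avoid dividing by zero on an empty consensus
    return round((number_eligible_relays * 100.0) / number_consensus_relays)


def minimum_number_eligible_relays(number_consensus_relays,
                                   minimum_percent_eligible_relays=60):
    """number_consensus_relays * (minimum_percent_eligible_relays / 100.0)."""
    return round(number_consensus_relays
                 * (minimum_percent_eligible_relays / 100.0))
```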
- "scanner_country=" CountryCode NL
- [Zero or one time.]
- The country, as in political geolocation, where the generator is run.
- This Line was added in version 1.3.0 of this specification.
- "destinations_countries=" CountryCodeList NL
- [Zero or one time.]
- The country, as in political geolocation, or countries where the
- destination Web server(s) are located.
- The destination Web Servers serve the data that the generator retrieves
- to measure the bandwidth.
- This Line was added in version 1.3.0 of this specification.
- "recent_consensus_count=" Int NL
- [Zero or one time.]
- The number of the different consensuses seen in the last data_period
- days. (data_period is 5 by default.)
- Assuming that Tor clients fetch a consensus every 1-2 hours,
- and that the data_period is 5 days, the Value of this Key SHOULD be
- between:
- data_period * 24 / 2 = 60
- data_period * 24 = 120
- This Line was added in version 1.4.0 of this specification.
- "recent_priority_list_count=" Int NL
- [Zero or one time.]
- The number of times that a list with a subset of relays prioritized
- to be measured has been created in the last data_period days.
- (data_period is 5 by default.)
- In 2019, with 7000 relays in the network, the Value of this Key SHOULD be
- approximately:
- data_period * 24 / 1.5 = 80
- where 1.5 is the approximate number of hours it takes to measure a
- priority list of 7000 * 0.05 (350) relays, when the fraction of relays
- in a priority list is 5% (0.05).
- This Line was added in version 1.4.0 of this specification.
- "recent_priority_relay_count=" Int NL
- [Zero or one time.]
- The number of relays that have been in the list of relays prioritized
- to be measured in the last data_period days. (data_period is 5 by
- default.)
- In 2019, with 7000 relays in the network, the Value of this Key SHOULD be
- approximately:
- 80 * (7000 * 0.05) = 28000
- where 0.05 (5%) is the fraction of relays in a priority list and 80
- is the approximate number of priority lists (see
- "recent_priority_list_count").
- This Line was added in version 1.4.0 of this specification.
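- The expected values given for recent_consensus_count,
- recent_priority_list_count and recent_priority_relay_count follow from
- simple arithmetic, reproduced here as a sketch. The function and
- parameter names are hypothetical. The defaults (5 days, 1.5 hours per
- priority list, 7000 relays, 5% per list) are the figures quoted in the
- text.

```python
def expected_consensus_count_range(data_period=5):
    """Bounds for recent_consensus_count: one consensus every 1 to 2 hours."""
    hours = data_period * 24
    return hours // 2, hours


def expected_priority_list_count(data_period=5, hours_per_list=1.5):
    """Approximate recent_priority_list_count."""
    return round(data_period * 24 / hours_per_list)


def expected_priority_relay_count(relays=7000, fraction=0.05,
                                  data_period=5, hours_per_list=1.5):
    """Approximate recent_priority_relay_count."""
    lists = expected_priority_list_count(data_period, hours_per_list)
    return round(lists * relays * fraction)
```

- A monitoring tool could compare the header values against these
- estimates to spot a stalled or misbehaving generator.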
- "recent_measurement_attempt_count=" Int NL
- [Zero or one time.]
- The number of times that any relay has been queued to be measured
- in the last data_period days. (data_period is 5 by default.)
- In 2019, with 7000 relays in the network, the Value of this Key SHOULD be
- approximately the same as "recent_priority_relay_count",
- assuming that there is one attempt to measure a relay for each relay that
- has been prioritized unless there are system, network or implementation
- issues.
- This Line was added in version 1.4.0 of this specification.
- "recent_measurement_failure_count=" Int NL
- [Zero or one time.]
- The number of times that the scanner attempted to measure a relay in
- the last data_period days (5 by default), but the relay has not been
- measured because of system, network or implementation issues.
- This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_error_count=" Int NL
- [Zero or one time.]
- The number of relays that have no successful measurements in the last
- data_period days (5 by default).
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_near_count=" Int NL
- [Zero or one time.]
- The number of relays that have some successful measurements in the last
- data_period days (5 by default), but all those measurements were
- performed in a period of time that was too short (by default 1 day).
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_old_count=" Int NL
- [Zero or one time.]
- The number of relays that have some successful measurements, but all
- those measurements are too old (more than 5 days, by default).
- Excludes relays that are already counted in
- recent_measurements_excluded_near_count.
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This Line was added in version 1.4.0 of this specification.
- "recent_measurements_excluded_few_count=" Int NL
- [Zero or one time.]
- The number of relays that don't have enough recent successful
- measurements. (Fewer than 2 measurements in the last 5 days, by
- default).
- Excludes relays that are already counted in
- recent_measurements_excluded_near_count and
- recent_measurements_excluded_old_count.
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This Line was added in version 1.4.0 of this specification.
- "time_to_report_half_network=" Int NL
- [Zero or one time.]
- The time in seconds that it would take to report measurements about
- half of the network, given the number of eligible relays and the time
- it took in the last days (5 days, by default).
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This Line was added in version 1.4.0 of this specification.
- KeyValue NL
- [Zero or more times.]
- There MUST NOT be multiple KeyValue header Lines with the same key.
- If there are, the parser SHOULD choose an arbitrary Line.
- If a parser does not recognize a Keyword in a KeyValue Line, it
- MUST be ignored.
- Future format versions may include additional KeyValue header Lines.
- Additional header Lines will be accompanied by a minor version
- increment.
- Implementations MAY add additional header Lines as needed. This
- specification SHOULD be updated to avoid conflicting meanings for
- the same header keys.
- Parsers MUST NOT rely on the order of these additional Lines.
- Additional header Lines MUST NOT use any keywords specified in the
- relay measurements format.
- If they do, the parser MAY ignore the conflicting keywords.
- Terminator NL
- [Zero or one time.]
- The Header List section ends with a Terminator.
- In version 1.0.0, Header List ends when the first relay bandwidth
- is found conforming to the next section.
- Implementations of version 1.1.0 and later SHOULD use a 5-character
- terminator.
- Tor 0.4.0.1-alpha and later look for a 5-character terminator,
- or the first relay bandwidth line. sbws versions 0.1.0 to 1.0.2
- used a 4-character terminator; this bug was fixed in 1.0.3.
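- A hedged sketch of how a parser might split the Header List from the
- Relay Lines, accepting a 5-character terminator, a 4-character
- terminator, or (as in version 1.0.0) the first relay bandwidth line.
- The names are hypothetical, the input is assumed to be a list of lines
- with NL already removed, and treating any line that has both node_id
- and bw keys as a relay line is an assumption of this sketch.

```python
TERMINATORS = ("=====", "====")


def looks_like_relay_line(line):
    """Heuristic: a relay line carries both a node_id and a bw KeyValue."""
    keys = [token.partition("=")[0] for token in line.split(" ")]
    return "node_id" in keys and "bw" in keys


def split_header(lines):
    """Return (timestamp, header dict, remaining relay lines)."""
    if not lines:
        raise ValueError("empty bandwidth file")
    timestamp = int(lines[0])  # the Timestamp line is not a KeyValue
    headers = {}
    for index in range(1, len(lines)):
        line = lines[index]
        if line in TERMINATORS:
            return timestamp, headers, lines[index + 1:]
        if looks_like_relay_line(line):
            return timestamp, headers, lines[index:]
        key, sep, value = line.partition("=")
        if sep and key and value and " " not in line:
            headers.setdefault(key, value)  # duplicates: keep the first
        # Lines that are not valid HeaderLines are ignored.
    return timestamp, headers, []
```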
- 2.3. Relay Line format
- It consists of zero or more RelayLines containing relay ids and
- bandwidths. The relays and their KeyValues are in arbitrary order.
- There MUST NOT be multiple KeyValue pairs with the same key in the same
- RelayLine. If there are, the parser SHOULD choose an arbitrary Value.
- There MUST NOT be multiple RelayLines per relay identity (node_id or
- master_key_ed25519). If there are, parsers SHOULD issue a warning.
- Parsers MAY reject the file, choose an arbitrary RelayLine, or ignore
- both RelayLines.
- If a parser does not recognize any extra material in a RelayLine,
- the extra material MUST be ignored.
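- One of the permitted ways to handle duplicate RelayLines (warn, then
- ignore every line for that identity) can be sketched as follows. The
- function name is hypothetical, relay lines are assumed to be already
- parsed into dicts of strings, and only node_id is used as the identity
- here for brevity.

```python
import logging


def index_relay_lines(relay_lines):
    """Map node_id to its relay line, dropping identities seen more than once."""
    by_id = {}
    duplicated = set()
    for relay in relay_lines:
        node_id = relay.get("node_id")
        if node_id is None:
            continue  # this sketch skips lines without node_id
        if node_id in by_id or node_id in duplicated:
            logging.warning("duplicate RelayLine for %s", node_id)
            duplicated.add(node_id)
            by_id.pop(node_id, None)  # ignore both (all) lines for this relay
            continue
        by_id[node_id] = relay
    return by_id
```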
- Each RelayLine includes the following KeyValue pairs:
- "node_id=" hexdigest
- [Exactly once.]
- The fingerprint for the relay's RSA identity key.
- Note: In bandwidth files read by Tor versions earlier than
- 0.3.4.1-alpha, node_id MUST NOT be at the end of the Line.
- These authority versions are no longer supported.
- Current Tor versions ignore master_key_ed25519, so node_id MUST be
- present in each relay Line.
- Implementations of version 1.1.0 and later SHOULD include both node_id
- and master_key_ed25519. Parsers SHOULD accept Lines that contain at
- least one of them.
- "master_key_ed25519=" MasterKey
- [Zero or one time.]
- The relay's master Ed25519 key, base64 encoded,
- without trailing "="s, to avoid ambiguity with KeyValue "="
- character.
- This KeyValue pair SHOULD be present, see the note under node_id.
- This KeyValue was added in version 1.1.0 of this specification.
- "bw=" Bandwidth
- [Exactly once.]
- The bandwidth of this relay in kilobytes per second.
- No Zero Bandwidths:
- Tor accepts zero bandwidths, but they trigger bugs in older Tor
- implementations. Therefore, implementations SHOULD NOT produce zero
- bandwidths. Instead, they SHOULD use one as their minimum bandwidth.
- If there are zero bandwidths, the parser MAY ignore them.
- Bandwidth Aggregation:
- Multiple measurements can be aggregated using an averaging scheme,
- such as a mean, median, or decaying average.
- Bandwidth Scaling:
- Torflow scales bandwidths to kilobytes per second. Other
- implementations SHOULD use kilobytes per second for their initial
- bandwidth scaling.
- If different implementations or configurations are used in votes for
- the same network, their measurements MAY need further scaling. See
- Appendix B for information about scaling, and one possible scaling
- method.
- MaxAdvertisedBandwidth:
- Bandwidth generators MUST limit the relays' measured bandwidth based
- on the MaxAdvertisedBandwidth.
- A relay's MaxAdvertisedBandwidth limits the bandwidth-avg in its
- descriptor. bandwidth-avg is the minimum of MaxAdvertisedBandwidth,
- BandwidthRate, RelayBandwidthRate, BandwidthBurst, and
- RelayBandwidthBurst.
- Therefore, generators MUST limit a relay's measured bandwidth to its
- descriptor's bandwidth-avg. This limit needs to be implemented in the
- generator, because generators may scale consensus weights before
- sending them to Tor.
- Generators SHOULD NOT limit measured bandwidths based on descriptors'
- bandwidth-observed, because that penalises new relays.
- sbws limits the relay's measured bandwidth to the bandwidth-avg
- advertised.
- Torflow partitions relays based on their bandwidth. For unmeasured
- relays, Torflow uses the minimum of all descriptor bandwidths,
- including bandwidth-avg (MaxAdvertisedBandwidth) and
- bandwidth-observed. Then Torflow measures the relays in each partition
- against each other, which implicitly limits a relay's measured
- bandwidth to the bandwidths of similar relays.
- Torflow also generates consensus weights based on the ratio between the
- measured bandwidth and the minimum of all descriptor bandwidths (at the
- time of the measurement). So when an operator reduces the
- MaxAdvertisedBandwidth for a relay, Torflow reduces that relay's
- measured bandwidth.
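- The bandwidth-avg limit and the "No Zero Bandwidths" rule can be
- combined in a short sketch. This is not sbws's or Torflow's actual
- code: the function name is hypothetical, inputs are assumed to be in
- bytes per second, a kilobyte is assumed to be 1000 bytes, and any
- further scaling (see Appendix B) is left out.

```python
def relay_bw_kilobytes(measured_bytes_per_sec, desc_bw_avg_bytes_per_sec):
    """Return a bw value in kilobytes per second for a RelayLine."""
    # Limit the measured bandwidth to the descriptor's bandwidth-avg.
    capped = min(measured_bytes_per_sec, desc_bw_avg_bytes_per_sec)
    # Convert to kilobytes per second and never emit a zero bandwidth.
    return max(1, round(capped / 1000))
```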
- KeyValue
- [Zero or more times.]
- Future format versions may include additional KeyValue pairs on a
- RelayLine.
- Additional KeyValue pairs will be accompanied by a minor version
- increment.
- Implementations MAY add additional relay KeyValue pairs as needed.
- This specification SHOULD be updated to avoid conflicting meanings
- for the same Keywords.
- Parsers MUST NOT rely on the order of these additional KeyValue
- pairs.
- Additional KeyValue pairs MUST NOT use any keywords specified in the
- header format.
- If they do, the parser MAY ignore the conflicting keywords.
- 2.4. Implementation details
- 2.4.1. Writing bandwidth files atomically
- To avoid inconsistent reads, implementations SHOULD write bandwidth files
- atomically. If the file is transferred from another host, it SHOULD be
- written to a temporary path, then renamed to the V3BandwidthsFile path.
- sbws versions 0.7.0 and later write the bandwidth file to an archival
- location, create a temporary symlink to that location, then atomically rename
- the symlink to the configured V3BandwidthsFile path.
- Torflow does not write bandwidth files atomically.
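- A minimal sketch of the write-then-rename approach, using only the
- Python standard library. It is not the symlink scheme that sbws uses,
- and the function name is hypothetical. The temporary file is created in
- the destination directory so that the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(path, content):
    """Write content to path so that readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".bwfile-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave a stray temporary file
        raise
```

- Note that mkstemp creates the file readable only by its owner, so a
- real deployment may need to adjust permissions before the rename if Tor
- runs as a different user.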
- 2.4.2. Additional KeyValue pair definitions
- This section lists the KeyValue pairs in RelayLines that current
- implementations generate.
- 2.4.2.1. Simple Bandwidth Scanner
- sbws RelayLines contain these keys:
- "node_id=" hexdigest
- As above.
- "bw=" Bandwidth
- As above.
- "nick=" nickname
- [Exactly once.]
- The relay nickname.
- Torflow also has a "nick=" KeyValue.
- "rtt=" Int
- [Zero or one time.]
- The Round Trip Time in milliseconds to obtain 1 byte of data.
- This KeyValue was added in version 1.1.0 of this specification.
- It became optional in version 1.3.0 or 1.4.0 of this specification.
- "time=" DateTime
- [Exactly once.]
- The date and time timestamp in ISO 8601 format and UTC time zone
- when the last bandwidth was obtained.
- This KeyValue was added in version 1.1.0 of this specification.
- The Torflow equivalent is "measured_at=".
- "success=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay were
- successful.
- This KeyValue was added in version 1.1.0 of this specification.
- "error_circ=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay
- failed because of circuit failures.
- This KeyValue was added in version 1.1.0 of this specification.
- The Torflow equivalent is "circ_fail=".
- "error_stream=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay
- failed because of stream failures.
- This KeyValue was added in version 1.1.0 of this specification.
- "error_destination=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay
- failed because the destination Web server was not available.
- This KeyValue was added in version 1.4.0 of this specification.
- "error_second_relay=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay
- failed because sbws could not find a second relay for the test circuit.
- This KeyValue was added in version 1.4.0 of this specification.
- "error_misc=" Int
- [Zero or one time.]
- The number of times that the bandwidth measurements for this relay
- failed because of other reasons.
- This KeyValue was added in version 1.1.0 of this specification.
- "bw_mean=" Int
- [Zero or one time.]
- The measured bandwidth mean for this relay in bytes per second.
- This KeyValue was added in version 1.2.0 of this specification.
- "bw_median=" Int
- [Zero or one time.]
- The measured bandwidth median for this relay in bytes per second.
- This KeyValue was added in version 1.2.0 of this specification.
- "desc_bw_average=" Int
- [Zero or one time.]
- The descriptor average bandwidth for this relay in bytes per second.
- This KeyValue was added in version 1.2.0 of this specification.
- "desc_obs_bw_last=" Int
- [Zero or one time.]
- The last descriptor observed bandwidth for this relay in bytes per
- second.
- This KeyValue was added in version 1.2.0 of this specification.
- "desc_obs_bw_mean=" Int
- [Zero or one time.]
- The descriptor observed bandwidth mean for this relay in bytes per
- second.
- This KeyValue was added in version 1.2.0 of this specification.
- "relay_recent_measurements_excluded_error_count=" Int
- [Zero or one time.]
- The number of recent relay measurement attempts that failed.
- Measurements are recent if they are in the last data_period days
- (5 by default).
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_near_count=" Int
- [Zero or one time.]
- When all of a relay's recent successful measurements were performed in
- a period of time that was too short (by default 1 day), the relay is
- excluded. This KeyValue contains the number of recent successful
- measurements for the relay that were ignored for this reason.
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_old_count=" Int
- [Zero or one time.]
- The number of successful measurements for this relay that are too old
- (more than data_period days, 5 by default).
- Excludes measurements that are already counted in
- relay_recent_measurements_excluded_near_count.
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This KeyValue was added in version 1.4.0 of this specification.
- "relay_recent_measurements_excluded_few_count=" Int
- [Zero or one time.]
- The number of successful measurements for this relay that were ignored
- because the relay did not have enough successful measurements (fewer
- than 2, by default).
- Excludes measurements that are already counted in
- relay_recent_measurements_excluded_near_count or
- relay_recent_measurements_excluded_old_count.
- (See the note in section 1.4, version 1.4.0, about excluded relays.)
- This KeyValue was added in version 1.4.0 of this specification.
- "under_min_report=" bool
- [Zero or one time.]
- If the value is 1, there are not enough eligible relays in the
- bandwidth file, and Tor bandwidth authorities MAY NOT vote on this
- relay. (Current Tor versions do not change their behaviour based on
- the "under_min_report" key.)
- If the value is 0 or the KeyValue is not present, there are enough
- relays in the bandwidth file.
- Because Tor versions released before April 2019 (see section 1.4. for
- the full list of versions) ignore "vote=0", generator implementations
- MUST NOT change the bandwidths for under_min_report relays. Using the
- same bw value makes authorities that do not understand "vote=0"
- or "under_min_report=1" produce votes that don't change relay weights
- too much. It also avoids flapping when the reporting threshold is
- reached.
- This KeyValue was added in version 1.4.0 of this specification.
- "unmeasured=" bool
- [Zero or one time.]
- If the value is 1, this relay was not successfully measured and
- Tor bandwidth authorities MAY NOT vote on this relay.
- (Current Tor versions do not change their behaviour based on
- the "unmeasured" key.)
- If the value is 0 or the KeyValue is not present, this relay
- was successfully measured.
- Because Tor versions released before April 2019 (see section 1.4. for
- the full list of versions) ignore "vote=0", generator implementations
- MUST set "bw=1" for unmeasured relays. Using the minimum bw value
- makes authorities that do not understand "vote=0" or "unmeasured=1"
- produce votes that don't change relay weights too much.
- This KeyValue was added in version 1.4.0 of this specification.
- "vote=" bool
- [Zero or one time.]
- If the value is 0, Tor directory authorities SHOULD ignore the relay's
- entry in the bandwidth file. They SHOULD vote for the relay the same
- way they would vote for a relay that is not present in the file.
- This MAY be the case when this relay was not successfully measured but
- it is included in the Bandwidth File, to diagnose why it was not
- measured.
- If the value is 1 or the KeyValue is not present, Tor directory
- authorities MUST use the relay's bw value in any votes for that relay.
- Implementations MUST also set "bw=1" for unmeasured relays.
- But they MUST NOT change the bw for under_min_report relays.
- (See the explanations under "unmeasured" and "under_min_report"
- for more details.)
- This KeyValue was added in version 1.4.0 of this specification.
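- The interaction between "vote", "unmeasured" and "under_min_report" can
- be sketched as follows. This is a minimal illustration, not code from
- Tor or any generator: the function names and the dict representation
- of a RelayLine are hypothetical, and emitting "vote=0" together with
- "under_min_report=1" is an assumption based on the text above.

```python
def bw_for_vote(keyvalues):
    """Return the bw to use in a vote, or None to ignore the RelayLine.

    `keyvalues` is a hypothetical dict of the RelayLine's KeyValues,
    with string values as they appear in the file.
    """
    # "vote" defaults to 1 when the KeyValue is not present.
    if keyvalues.get("vote", "1") == "0":
        return None
    return int(keyvalues["bw"])


def generator_keyvalues(bw, measured, under_min_report):
    """Build the bw and flag KeyValues a generator might emit for a relay."""
    if not measured:
        # Unmeasured relays get the minimum bw, so that authorities
        # which ignore "vote=0" barely change the relay's weight.
        return {"bw": 1, "unmeasured": 1, "vote": 0}
    if under_min_report:
        # The bw is left unchanged for under_min_report relays.
        return {"bw": bw, "under_min_report": 1, "vote": 0}
    return {"bw": bw}
```

- An authority that does not understand "vote=0" would simply read the
- "bw" value, which is why the generator side chooses it carefully.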
- 2.4.2.2. Torflow
- Torflow RelayLines include node_id and bw, and other KeyValue pairs [2].
- References:
- 1. https://gitweb.torproject.org/torflow.git
- 2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332
- The Torflow specification is outdated, and does not match the current
- implementation. See section A.1. for the format produced by Torflow.
- 3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
- 4. https://gitweb.torproject.org/torspec.git/tree/version-spec.txt
- 5. https://semver.org/
- A. Sample data
- The following has not been obtained from any real measurement.
- A.1. Generated by Torflow
- This is an example version 1.0.0 document:
- 1523911758
- node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath
- node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath
- A.2. Generated by sbws version 0.1.0
- 1523911758
- version=1.1.0
- software=sbws
- software_version=0.1.0
- latest_bandwidth=2018-04-16T20:49:18
- file_created=2018-04-16T21:49:18
- generator_started=2018-04-16T15:13:25
- earliest_bandwidth=2018-04-16T15:13:26
- ====
- bw=380 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26
- bw=189 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36
- A.3. Generated by sbws version 1.0.3
- 1523911758
- version=1.2.0
- latest_bandwidth=2018-04-16T20:49:18
- file_created=2018-04-16T21:49:18
- generator_started=2018-04-16T15:13:25
- earliest_bandwidth=2018-04-16T15:13:26
- minimum_number_eligible_relays=3862
- minimum_percent_eligible_relays=60
- number_consensus_relays=6436
- number_eligible_relays=6000
- percent_eligible_relays=93
- software=sbws
- software_version=1.0.3
- =====
- bw=38000 bw_mean=1127824 bw_median=1180062 desc_avg_bw=1073741824 desc_obs_bw_last=17230879 desc_obs_bw_mean=14732306 error_circ=0 error_misc=0 error_stream=1 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ nick=Test node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 rtt=380 success=1 time=2018-05-08T16:13:26
- bw=1 bw_mean=199162 bw_median=185675 desc_avg_bw=409600 desc_obs_bw_last=836165 desc_obs_bw_mean=858030 error_circ=0 error_misc=0 error_stream=0 master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I nick=Test2 node_id=$96C15995F30895689291F455587BD94CA427B6FC rtt=378 success=1 time=2018-05-08T16:13:36
- A.3.1. When there are not enough eligible measured relays:
- 1540496079
- version=1.2.0
- earliest_bandwidth=2018-10-20T19:35:52
- file_created=2018-10-25T19:35:03
- generator_started=2018-10-25T11:42:56
- latest_bandwidth=2018-10-25T19:34:39
- minimum_number_eligible_relays=3862
- minimum_percent_eligible_relays=60
- number_consensus_relays=6436
- number_eligible_relays=2960
- percent_eligible_relays=46
- software=sbws
- software_version=1.0.3
- =====
- A.4. Headers generated by sbws version 1.0.4
- 1523911758
- version=1.3.0
- latest_bandwidth=2018-04-16T20:49:18
- destinations_countries=TH,ZZ
- file_created=2018-04-16T21:49:18
- generator_started=2018-04-16T15:13:25
- earliest_bandwidth=2018-04-16T15:13:26
- minimum_number_eligible_relays=3862
- minimum_percent_eligible_relays=60
- number_consensus_relays=6436
- number_eligible_relays=6000
- percent_eligible_relays=93
- scanner_country=SN
- software=sbws
- software_version=1.0.4
- =====
- A.5. Generated by sbws version 1.1.0
- 1523911758
- version=1.4.0
- latest_bandwidth=2018-04-16T20:49:18
- destinations_countries=TH,ZZ
- file_created=2018-04-16T21:49:18
- generator_started=2018-04-16T15:13:25
- earliest_bandwidth=2018-04-16T15:13:26
- minimum_number_eligible_relays=3862
- minimum_percent_eligible_relays=60
- number_consensus_relays=6436
- number_eligible_relays=6000
- percent_eligible_relays=93
- recent_measurement_attempt_count=6243
- recent_measurement_failure_count=732
- recent_measurements_excluded_error_count=969
- recent_measurements_excluded_few_count=3946
- recent_measurements_excluded_near_count=90
- recent_measurements_excluded_old_count=0
- recent_priority_list_count=20
- recent_priority_relay_count=6243
- scanner_country=SN
- software=sbws
- software_version=1.1.0
- time_to_report_half_network=57273
- =====
- bw=1 error_circ=1 error_destination=0 error_misc=0 error_second_relay=0 error_stream=0 master_key_ed25519=J3HQ24kOQWac3L1xlFLp7gY91qkb5NuKxjj1BhDi+m8 nick=snap269 node_id=$DC4D609F95A52614D1E69C752168AF1FCAE0B05F relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=1 relay_recent_measurements_excluded_near_count=3 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=3 time=2019-03-16T18:20:57 unmeasured=1 vote=0
- bw=1 error_circ=0 error_destination=0 error_misc=0 error_second_relay=0 error_stream=2 master_key_ed25519=h6ZB1E1yBFWIMloUm9IWwjgaPXEpL5cUbuoQDgdSDKg nick=relay node_id=$C4544F9E209A9A9B99591D548B3E2822236C0503 relay_recent_measurement_attempt_count=3 relay_recent_measurements_excluded_error_count=2 relay_recent_measurements_excluded_few_count=1 relay_recent_consensus_count=3 relay_recent_priority_list_count=3 success=1 time=2019-03-17T06:50:58 unmeasured=1 vote=0
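- The samples in this appendix share a simple shape: a timestamp line,
- optional header KeyValue lines, an optional terminator made only of
- "=" characters, and RelayLines of space-separated KeyValues. The
- following sketch reads that shape. The function name is hypothetical,
- and treating any line that contains "node_id" as a RelayLine is a
- simplifying assumption made here, not a rule from this specification.

```python
def parse_bandwidth_file(text):
    """Split a bandwidth file into (timestamp, header, relay lines)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    timestamp = int(lines[0])
    header = {}
    relays = []
    for line in lines[1:]:
        # The terminator consists only of "=" characters.
        if set(line) == {"="}:
            continue
        # Split each KeyValue on the first "=" only, so values may
        # themselves contain "=".
        pairs = dict(item.split("=", 1) for item in line.split())
        if "node_id" in pairs:
            relays.append(pairs)
        else:
            header.update(pairs)
    return timestamp, header, relays
```

- All values are kept as strings; a real parser would also validate
- each KeyValue against the rules in section 2.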
- B. Scaling bandwidths
- B.1. Scaling requirements
- Tor accepts zero bandwidths, but they trigger bugs in older Tor
- implementations. Therefore, scaling methods SHOULD perform the
- following checks:
- * If the total bandwidth is zero, all relays should be given equal
- bandwidths.
- * If the scaled bandwidth is zero, it should be rounded up to one.
- Initial experiments indicate that scaling may not be needed for
- torflow and sbws, because their measured bandwidths are similar
- enough already.
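- The two checks above can be sketched as a small post-processing step.
- The function name is hypothetical, and using 1 as the equal bandwidth
- when the total is zero is an assumption: the text only requires that
- all relays receive equal bandwidths.

```python
def apply_scaling_checks(scaled_bws):
    """Apply the zero-bandwidth checks to a dict of node_id -> scaled bw."""
    if sum(scaled_bws.values()) == 0:
        # Total bandwidth is zero: give every relay the same bandwidth.
        return {node_id: 1 for node_id in scaled_bws}
    # Otherwise, round individual zero bandwidths up to one.
    return {
        node_id: (1 if bw == 0 else bw)
        for node_id, bw in scaled_bws.items()
    }
```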
- B.2. A linear scaling method
- If scaling is required, here is a simple linear bandwidth scaling
- method, which ensures that all bandwidth votes contain approximately
- the same total bandwidth:
- 1. Calculate the relay quota by dividing the total measured bandwidth
- in all votes, by the number of relays with measured bandwidth
- votes. In the public tor network, this is approximately 7500 as of
- April 2018. The quota should be a consensus parameter, so it can be
- adjusted for all generators on the network.
- 2. Calculate a vote quota by multiplying the relay quota by the number
- of relays this bandwidth authority has measured
- bandwidths for.
- 3. Calculate a scaling factor by dividing the vote quota by the
- total unscaled measured bandwidth in this bandwidth
- authority's upcoming vote.
- 4. Multiply each unscaled measured bandwidth by the scaling
- factor.
- Now, the total scaled bandwidth in the upcoming vote is
- approximately equal to the quota.
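- Steps 2 to 4 can be sketched as follows, taking the relay quota from
- step 1 as an input. The function name, the default quota of 7500, the
- rounding to integers, and the handling of a zero total are assumptions
- made for this illustration.

```python
def linear_scale(unscaled_bws, relay_quota=7500):
    """Scale a dict of node_id -> unscaled measured bw to the vote quota."""
    total = sum(unscaled_bws.values())
    if total == 0:
        # B.1: with a zero total, give all relays equal bandwidths.
        return {node_id: relay_quota for node_id in unscaled_bws}
    # Step 2: the vote quota covers every relay this authority measured.
    vote_quota = relay_quota * len(unscaled_bws)
    # Step 3: the factor that maps the unscaled total onto the vote quota.
    factor = vote_quota / total
    # Step 4: scale each relay, rounding zero results up to one (B.1).
    return {
        node_id: max(1, round(bw * factor))
        for node_id, bw in unscaled_bws.items()
    }
```

- For example, two relays measured at 100 and 300 with a relay quota of
- 7500 give a vote quota of 15000 and a scaling factor of 37.5, so the
- scaled bandwidths are 3750 and 11250.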
- B.3. Quota changes
- If all generators are using scaling, the quota can be gradually
- reduced or increased as needed. Smaller quotas decrease the size
- of uncompressed consensuses, and may decrease the size of
- consensus diffs and compressed consensuses. But if the relay
- quota is too small, some relays may be over- or under-weighted.
- B.4. Torflow aggregation
- Torflow implements two methods to compute the bandwidth values from the
- (stream) bandwidth measurements: with and without PID control feedback.
- The method described here is without PID control (see Torflow
- specification, section 2.2).
- In the following steps, the relays' measured bandwidths are the ones
- that this bandwidth authority has measured for the relays that would
- be included in this bandwidth authority's upcoming vote.
- 1. Calculate the filtered bandwidth for each relay:
- - choose the relay's measurements (`bw_j`) that are equal to or greater
- than the mean of the measurements for this relay
- - calculate the mean of those measurements
- In pseudocode:
- bw_filt_i = mean(bw_j for each bw_j >= mean(bw_j))
- 2. Calculate network averages:
- - calculate the filtered average by dividing the sum of all the
- relays' filtered bandwidth by the number of relays that have been
- measured (`n`), ie, calculate the mean average of the relays'
- filtered bandwidth.
- - calculate the stream average by dividing the sum of all the
- relays' measured bandwidth by the number of relays that have been
- measured (`n`), ie, calculate the mean average of the relays'
- measured bandwidth.
- In pseudocode:
- bw_avg_filt = sum(bw_filt_i) / n
- bw_avg_strm = sum(bw_i) / n
- 3. Calculate ratios for each relay:
- - calculate the filtered ratio by dividing each relay filtered
- bandwidth by the filtered average
- - calculate the stream ratio by dividing each relay measured
- bandwidth by the stream average
- In pseudocode:
- r_filt_i = bw_filt_i / bw_avg_filt
- r_strm_i = bw_i / bw_avg_strm
- 4. Calculate the final ratio for each relay:
- The final ratio is the larger of the filtered ratio and the
- stream ratio.
- In pseudocode:
- r_i = max(r_filt_i, r_strm_i)
- 5. Calculate the scaled bandwidth for each relay:
- The most recent descriptor observed bandwidth (`bw_obs_i`) is
- multiplied by the final ratio.
- In pseudocode:
- bw_new_i = r_i * bw_obs_i
- <<In this way, the resulting network status consensus bandwidth
- values are effectively re-weighted proportional to how much faster
- the node was as compared to the rest of the network.>>
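- The five steps above can be put together in a short sketch. This is
- an illustration under stated assumptions, not Torflow's code: the
- function name and input dicts are hypothetical, and a relay's measured
- bandwidth (`bw_i`) is taken to be the mean of its measurements.

```python
from statistics import mean


def torflow_scale(measurements, bw_obs):
    """Scale relays without PID control.

    measurements: node_id -> list of stream bandwidth measurements (bw_j)
    bw_obs: node_id -> most recent descriptor observed bandwidth
    """
    # Assumed measured bandwidth per relay: the mean of its measurements.
    bw_strm = {n: mean(ms) for n, ms in measurements.items()}
    # Step 1: mean of the measurements at or above the relay's own mean.
    bw_filt = {
        n: mean([m for m in ms if m >= bw_strm[n]])
        for n, ms in measurements.items()
    }
    # Step 2: network averages over all measured relays.
    bw_avg_filt = mean(list(bw_filt.values()))
    bw_avg_strm = mean(list(bw_strm.values()))
    scaled = {}
    for n in measurements:
        # Step 3: ratios of each relay to the network averages.
        r_filt = bw_filt[n] / bw_avg_filt
        r_strm = bw_strm[n] / bw_avg_strm
        # Step 4: the final ratio is the larger of the two.
        r = max(r_filt, r_strm)
        # Step 5: scale the descriptor observed bandwidth.
        scaled[n] = r * bw_obs[n]
    return scaled
```

- The sketch raises an error when a network average is zero or a relay
- has no measurements; a generator would need to handle those cases, for
- example with the checks in section B.1.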
|