path: root/src/main/java/com/google/devtools/build/lib/syntax/Lexer.java
Commit message (author, date)
* Reformatting (laurentlb, 2018-06-11)
  Switch statements were poorly formatted. Fixing it in a separate commit so that it doesn't clutter the diff.
  RELNOTES: None.
  PiperOrigin-RevId: 200062930
* Stop allocating new tokens in the lexer (laurentlb, 2018-06-05)
  There's only one Token and it gets reused. This reduces the memory usage of the lexer. Parsing time seems to be 5%-10% faster with this change on a large file. This makes little difference on the overall performance of Bazel though.
  RELNOTES: None.
  PiperOrigin-RevId: 199310860
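The token reuse described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual Bazel code: the class `ReusingLexer`, its nested `Token`, the field names, and the string-valued `kind` are all hypothetical.

```java
/** Hypothetical sketch: a lexer that overwrites one mutable token instead of allocating. */
final class ReusingLexer {
  /** Mutable token; every name here is illustrative, not taken from Bazel. */
  static final class Token {
    String kind;
    int left;  // start offset, inclusive
    int right; // end offset, exclusive
  }

  private final String input;
  private int pos = 0;
  private final Token token = new Token(); // the only Token this lexer ever allocates

  ReusingLexer(String input) {
    this.input = input;
  }

  /** Advances to the next token and returns the shared instance; kind is "EOF" at the end. */
  Token nextToken() {
    while (pos < input.length() && input.charAt(pos) == ' ') {
      pos++; // skip spaces between tokens
    }
    token.left = pos;
    if (pos >= input.length()) {
      token.kind = "EOF";
    } else if (Character.isLetterOrDigit(input.charAt(pos))) {
      while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) {
        pos++;
      }
      token.kind = "WORD";
    } else {
      pos++; // any other character is a one-character token in this sketch
      token.kind = "PUNCT";
    }
    token.right = pos;
    return token;
  }
}
```

The trade-off of this design is that a caller must copy any fields it needs before asking for the next token, because the returned object is overwritten on every call.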
* Get rid of the tokens queue in the lexer (laurentlb, 2018-06-04)
  Next step will be to skip token allocation.
  RELNOTES: None.
  PiperOrigin-RevId: 199121625
* Reject files when the first line is indented. (laurentlb, 2018-05-24)
  A bug in the lexer ignored indentation on the first line of a file. This now causes an error.
  Also, remove the COMMENT token from the lexer. Comments are now accessed separately. This will allow further optimizations in the lexer. It also aligns the code a bit more with the Go implementation.
  RELNOTES[INC]: Indentation on the first line of a file was previously ignored. This is now fixed.
  PiperOrigin-RevId: 197889775
* Skylark: do not eagerly scan the whole file (laurentlb, 2018-05-22)
  With this change, the parser explicitly asks the lexer to give the next token. To avoid changing the lexer too much, the tokenize() method populates a queue (it may add multiple tokens at the same time).
  While this reduces the peak memory usage, further work is needed to actually improve the performance.
  RELNOTES: None.
  PiperOrigin-RevId: 197576326
* Change profiling to only accept strings for its "description" argument. (janakr, 2018-04-01)
  Profiling can hold onto objects for the duration of the build, and some of those objects may be temporary that should not be persisted. In particular, UnixGlob and its inner classes should not outlive loading and analysis.
  For the most part, care was taken in this CL to only use strings that required no additional construction, mainly to minimize garbage (retaining references to newly created strings is not as great a concern since only the strings corresponding to the slowest K tasks are retained, for some relatively small values of K). Action descriptions for actually executing actions are eagerly expanded because that work is minimal compared to the work of actually executing an action.
  PiperOrigin-RevId: 191251488
* Deletes CODEC fields now that they are no longer needed. (shahan, 2018-02-28)
  PiperOrigin-RevId: 187397314
* Codec for Location. (shahan, 2018-01-16)
  * Moves SingletonCodec to third_party.
  PiperOrigin-RevId: 182143153
* Skylark parser: make the end position of location ranges inclusive. (fzaiser, 2017-10-06)
  Previously, the end line and column of a location were the position past the actual end of a location. This makes sense for the end offset, because one can use `input.substring(startOffset, endOffset)` to get the string belonging to an ASTNode. However, the line and column (as opposed to the offset) aren't used for that. Therefore I made the change that the end line and column now point to the last character in the location. This is also the way every compiler I know does it.
  RELNOTES: none
  PiperOrigin-RevId: 170723732
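The distinction between an exclusive end offset and an inclusive end line/column can be sketched as below. The class `LocationSketch`, its method names, and the choice of 1-based lines and columns are assumptions made for illustration; they are not taken from the Bazel sources.

```java
/** Hypothetical sketch of exclusive end offsets versus inclusive end line/column. */
final class LocationSketch {
  /** Returns {line, column} of the character at the given offset (1-based here by assumption). */
  static int[] lineAndColumn(String input, int offset) {
    int line = 1;
    int column = 1;
    for (int i = 0; i < offset; i++) {
      if (input.charAt(i) == '\n') {
        line++;
        column = 1;
      } else {
        column++;
      }
    }
    return new int[] {line, column};
  }

  /**
   * End position for the range [startOffset, endOffset): the offset stays exclusive so that
   * substring() works, while the reported line and column point at the last character.
   */
  static int[] inclusiveEnd(String input, int startOffset, int endOffset) {
    // An empty range has no last character; fall back to its start in this sketch.
    return lineAndColumn(input, Math.max(startOffset, endOffset - 1));
  }
}
```

For the input `x = foo` with a node spanning offsets 4 to 7, `input.substring(4, 7)` still yields `foo`, while `inclusiveEnd` reports the column of the final `o` rather than the column after it.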
* Fix wrong location of string literals in the lexer (fzaiser, 2017-09-01)
  RELNOTES: none
  PiperOrigin-RevId: 167263494
* Fix lexer bug that allowed non-ASCII letters in identifiers (fzaiser, 2017-08-17)
  RELNOTES: None
  PiperOrigin-RevId: 165434934
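The commit does not say how the bug arose. One plausible way for non-ASCII letters to slip into identifiers in Java is a Unicode-aware check such as `Character.isLetter`; an ASCII-only check could look like the sketch below. The class `AsciiIdentifiers` and its methods are hypothetical and are not the lexer's actual code.

```java
/** Hypothetical sketch: identifier checks restricted to ASCII letters, digits and '_'. */
final class AsciiIdentifiers {
  static boolean isIdentifierStart(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
  }

  static boolean isIdentifierPart(char c) {
    return isIdentifierStart(c) || (c >= '0' && c <= '9');
  }

  /** Returns true if s is a non-empty identifier made only of ASCII characters. */
  static boolean isIdentifier(String s) {
    if (s.isEmpty() || !isIdentifierStart(s.charAt(0))) {
      return false;
    }
    for (int i = 1; i < s.length(); i++) {
      if (!isIdentifierPart(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }
}
```

By contrast, `Character.isLetter('\u00e9')` returns true, which is why a Unicode-aware predicate would accept a name such as `caf\u00e9`.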
* Misc cleanups of syntax dir (brandjon, 2017-07-12)
  RELNOTES: None
  PiperOrigin-RevId: 161560683
* Make 'load' a keyword (laurentlb, 2017-07-05)
  RELNOTES[INC]: `load` is now a language keyword, it cannot be used as an identifier
  PiperOrigin-RevId: 160944121
* Forbid tabs for indentation (laurentlb, 2017-07-05)
  RELNOTES[INC]: Using tabs for indentation is now forbidden in .bzl files
  PiperOrigin-RevId: 160888064
* Forbid octal sequences greater than \377 (0xff) in strings. (laurentlb, 2017-06-26)
  RELNOTES[INC]: In strings, octal sequences greater than \377 are now forbidden (e.g. "\\600"). Previously, Blaze had the same behavior as Python 2, where "\\450" == "\050".
  PiperOrigin-RevId: 160147169
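The range check described above can be sketched as follows. This is an illustration only: the class `OctalEscape`, the `decode` method, and the exception type and message are assumptions, not the lexer's real API.

```java
/** Hypothetical sketch: decode an octal escape and reject values above \377 (0xff). */
final class OctalEscape {
  /**
   * Decodes the escape whose backslash is at index {@code backslash} in {@code s}.
   * Reads up to three octal digits, as Python does.
   */
  static int decode(String s, int backslash) {
    int value = 0;
    int i = backslash + 1;
    for (int n = 0; n < 3 && i < s.length(); n++, i++) {
      char c = s.charAt(i);
      if (c < '0' || c > '7') {
        break; // not an octal digit: the escape ends here
      }
      value = value * 8 + (c - '0');
    }
    if (value > 0xff) {
      // The old Python 2-like behavior would instead keep only the low 8 bits (value & 0xff).
      throw new IllegalArgumentException("octal escape sequence out of range (maximum is \\377)");
    }
    return value;
  }
}
```

With this check, `\377` decodes to 255 and `\600` (384) is rejected. Under the previous behavior `\450` (296) was truncated to 40, which is the same value as `\050`.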
* Add support for the '0o' octal prefix for integers. (laurentlb, 2017-06-16)
  RELNOTES: Octal prefix '0' is deprecated in favor of '0o' (use 0o777 instead of 0777).
  PiperOrigin-RevId: 158982649
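How a lexer might tell the prefixes apart can be sketched as below. The class `IntLiterals` and its `parse` method are hypothetical; the sketch handles only lowercase prefixes and is not the actual implementation.

```java
/** Hypothetical sketch: integer literals with '0x', '0o' and the deprecated '0' octal prefix. */
final class IntLiterals {
  static int parse(String literal) {
    if (literal.startsWith("0x")) {
      return Integer.parseInt(literal.substring(2), 16);
    }
    if (literal.startsWith("0o")) {
      return Integer.parseInt(literal.substring(2), 8); // preferred octal form
    }
    if (literal.length() > 1 && literal.startsWith("0")) {
      return Integer.parseInt(literal.substring(1), 8); // deprecated form, e.g. 0777
    }
    return Integer.parseInt(literal, 10);
  }
}
```

Both `0o777` and the deprecated `0777` denote the same value, 511. A real lexer would presumably report an error or a deprecation warning for the old form instead of silently accepting it.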
* Add operator // for division. (laurentlb, 2017-05-22)
  In both Python 2 and Python 3, the operator // is used for int division. With Python 3, operator / is for float division.
  RELNOTES: None.
  PiperOrigin-RevId: 156582262
* Global cleanup change. (Googler, 2017-03-14)
  --
  PiperOrigin-RevId: 150051360
  MOS_MIGRATED_REVID=150051360
* Use skylark-preferred quote char for string literal (Michajlo Matijkiw, 2017-02-23)
  We currently have no need to discern between strings quoted with ' or ". While it could be nice for something one day (and may have been in the past), it's yagni now. Removing the distinction simplifies string concatenation.
  --
  PiperOrigin-RevId: 148273400
  MOS_MIGRATED_REVID=148273400
* Minor code cleanup (Laurent Le Brun, 2017-02-16)
  Inlining the function makes the code more readable.
  --
  PiperOrigin-RevId: 147711468
  MOS_MIGRATED_REVID=147711468
* Fixed StringIndexOutOfBoundsException in the lexer (Vladimir Moskva, 2016-11-08)
  --
  MOS_MIGRATED_REVID=138387292
* Automated cleanup (Laurent Le Brun, 2016-10-12)
  --
  MOS_MIGRATED_REVID=135816105
* Remove support for "Python" parsing mode. (Laurent Le Brun, 2016-10-07)
  It was unused in Bazel.
  --
  MOS_MIGRATED_REVID=135483937
* Return EMTPY_FRAGMENT instead of null when there is no filename for a Lexer. (John Cater, 2016-10-04)
  Fixes #1865.
  --
  Change-Id: I29ad4eab2c0fbf543989029429449c5023f3b065
  Reviewed-on: https://bazel-review.googlesource.com/#/c/6350
  MOS_MIGRATED_REVID=134824338
* Fix handling of backslash-escaped CRLF line terminators. (Lukacs Berki, 2016-06-23)
  The character sequences in the test cases behave the same way Python does.
  Fixed #1306.
  --
  MOS_MIGRATED_REVID=125568600
* Add support for more augmented-assignment operators. (Googler, 2016-06-03)
  Based on my limited understanding of Python syntax, the only things we don't support are //= and **=, but it looks like Skylark doesn't support the corresponding infix operators.
  RELNOTES[NEW]: add support for the '-=', '*=', '/=', and '%=' operators to skylark. Notably, we do not support '|=' because the semantics of skylark sets are sufficiently different from python sets.
  --
  MOS_MIGRATED_REVID=123889776
* Make the parser handle CRLF correctly. (Lukacs Berki, 2016-05-24)
  Fixes #1300.
  --
  MOS_MIGRATED_REVID=123090421
* Lexer: Handle triple quoted raw strings (e.g. r"""abc"""). (Laurent Le Brun, 2015-10-21)
  --
  MOS_MIGRATED_REVID=105956734
* Rollback of unknown previous commit. (Googler, 2015-10-13)
  *** Reason for rollback ***
  Broke bazel build.
  *** Original change description ***
  Update iossim for Xcode 7 support. As of Xcode 7, supportedDeviceTypesByName was replaced by supportedDeviceTypesByAlias. This is from latest chromium build. Downstream hash is 9dd179a339c0457f8754069e0774b38f69c258a8. The latest merge was to upstream ef05b7da00844c0d500c4a7f20d4095dab56e7fe
  *** Also includes the following changes:
  Size the Lexer tokenization to minimize internal resizing. This value is chosen empirically.
  --
  Fixes toolchain selection in the generated Android NDK crosstools by making each target_cpu and compiler field unique. Note that there are some problems with the clang compilers (e.g. can't find ld), which I'll fix in a subsequent change.
  --
  Update iossim for Xcode 7 support. As of Xcode 7, supportedDeviceTypesByName was replaced by supportedDeviceTypesByAlias. This is from latest chromium build. Downstream hash is 9dd179a339c0457f8754069e0774b38f69c258a8. The latest merge was to upstream ef05b7da00844c0d500c4a7f20d4095dab56e7fe
  --
  MOS_MIGRATED_REVID=105337154
* Rationalize copyright headers (Damien Martin-Guillerez, 2015-09-25)
  The headers were modified with `find . -type f -exec 'sed' '-Ei' 's|Copyright 201([45]) Google|Copyright 201\1 The Bazel Authors|' '{}' ';'` and manually edited where the copyright is not owned by Google. Because of the nature of ijar, I did not modify the headers of files owned by Alan Donovan.
  The list of authors was extracted from the git log. It is missing older Google contributors, who can be added on demand.
  --
  MOS_MIGRATED_REVID=103938715
* Add profiling for Skylark lexer, parser, user- and built-in functions. (Googler, 2015-08-28)
  --
  MOS_MIGRATED_REVID=101769963
* Introduce '|' operator for set union. (Laurent Le Brun, 2015-08-25)
  --
  MOS_MIGRATED_REVID=101363350
* Make two Skyframe nodes with the same events and values equal. (Janak, 2015-07-13)
  We do this by implementing equality for TaggedEvents (and all objects it transitively includes).
  Before this change, if a Skyframe node re-evaluated to the same value as in the previous build, but had (transitive) events, change pruning would not cut off the evaluation of its parents. This is not a big issue in practice because most nodes that would re-evaluate to the same value (like FileValues or GlobValues) never emit events, and others (like ActionExecutionValues) have secondary caches that mask this effect.
  Also do a drive-by fix where we were using the hash code of a nested set instead of the shallow hash code (didn't have any bad effects in practice because we never hash these values).
  (Minor formatting clean-ups from https://bazel-review.googlesource.com/1610 )
  --
  Change-Id: I751a8479627f0456993c5ec8834528aeb593d736
  Reviewed-on: https://bazel-review.googlesource.com/1610
  MOS_MIGRATED_REVID=98115908
* Allow users of Blaze Lexer to explicitly specify the line-number table of a file. (Carmi Grushko, 2015-06-16)
  --
  MOS_MIGRATED_REVID=96039514
* Remove Path from Location, ParserInputSource and bunch of other low-level classes. (Lukacs Berki, 2015-06-12)
  This makes the code cleaner because a lot of places never read the file and thus never needed a Path in the first place.
  I got to this change in a somewhat convoluted way:
  - I wanted the default tools in Android rules to point to //external:
  - I wanted to make sure that that doesn't cause an error if no Android rules are built, thus I had to add some binding for them in the default WORKSPACE file
  - I wanted the Android rules not to depend on Bazel core with an eye towards eventually moving them to a separate jar / Skylark code
  - The default WORKSPACE file is currently composed from files extracted by the Bazel launcher which would make the Android rules depend on a very core mechanism
  - I couldn't simply pass in jdk.WORKSPACE as a String because Location, ParserInputSource and a bunch of other things needed a Path, which a simple string doesn't have.
  Thus, this change.
  --
  MOS_MIGRATED_REVID=95828839
* Build language: Implement integer division (Laurent Le Brun, 2015-04-15)
  --
  MOS_MIGRATED_REVID=91192716
* Allow evaluation from String (Francois-Rene Rideau, 2015-04-13)
  Lift the Evaluation code from the test files AbstractParserTestCase and AbstractEvaluationTestCase into new files EvaluationContext.
  Remove this code's dependency on FsApparatus (and thus to InMemoryFS), by making the Lexer accept null as filename.
  Also remove dependency on EventCollectionApparatus; parameterized by an EventHandler.
  Have the SkylarkSignatureProcessor use this Evaluation for defaultValue-s.
  While refactoring evaluation, have SkylarkShell use it, which fixes its ValidationEnvironment issues.
  TODO: refactor the tests to use this new infrastructure.
  --
  MOS_MIGRATED_REVID=90824736
* Parser: Add Python 3 keywords. (Laurent Le Brun, 2015-03-24)
  RELNOTES: Python 3 keywords are added to the lexer. They cannot be used as identifiers.
  --
  MOS_MIGRATED_REVID=89301541
* Automated [] rollback of []. (Laurent Le Brun, 2015-03-20)
  *** Reason for rollback ***
  This CL broke []
  *** Original change description ***
  Skylark: New error message in the lexer when an unsupported Python keyword is used.
  --
  MOS_MIGRATED_REVID=88954426
* Skylark: New error message in the lexer when an unsupported Python keyword is used. (Laurent Le Brun, 2015-03-18)
  --
  MOS_MIGRATED_REVID=88930203
* Parser: Add the 'pass' keyword (Laurent Le Brun, 2015-03-18)
  --
  MOS_MIGRATED_REVID=88857682
* Some cleanup changes. (Ulf Adams, 2015-03-05)
  --
  MOS_MIGRATED_REVID=87821306
* Introduce first class function signatures; make the parser use them. (Francois-Rene Rideau, 2015-02-19)
  This is the first meaty CL in a series to refactor the Skylark function call protocol.
  1- We introduce a first-class notion of FunctionSignature, that supports positional and named-only arguments, mandatory and optional, default values, type-checking, *stararg and **kwarg;
  2- To keep things clean, we distinguish two different kinds of Argument's: Argument.Passed that appears in function calls, and Parameter, that appears in function definitions.
  3- We refactor the Parser so it uses this infrastructure, and make minimal changes to MixedModeFunction so that it works with it (but don't actually implement *starparam and **kwparam yet).
  4- As we modify FuncallExpression, we ensure that the args and kwargs arguments it passes to the underlying function are immutable, as a prerequisite to upcoming implementation of *starparam and **kwparam as being provided directly from a skylark list or dict.
  Further changes under review will take advantage of this FunctionSignature to redo all our function call protocol, to be used uniformly for both UserDefinedFunction's and builtin functions. The result will be a simpler inheritance model, with better type-checking, builtin functions that are both simpler and better documented, and many redundant, competing, functionality-limited codepaths being merged and replaced by something better.
  NB: The changes to MixedModeFunction, SkylarkFunction and MethodLibrary are temporary hacks to be done away with in an upcoming CL. The rest is the actual changes.
  --
  MOS_MIGRATED_REVID=86704072
* Cosmetic changes moved out of [] (Francois-Rene Rideau, 2015-02-11)
  These shouldn't affect the semantics of the program in any significant way, but will hush the linter and other such metaprograms.
  --
  MOS_MIGRATED_REVID=86089271
* Fix linter issues in lib/syntax. (Laurent Le Brun, 2015-02-09)
  --
  MOS_MIGRATED_REVID=85882605
* Update from Google. (Han-Wen Nienhuys, 2015-02-25)
  --
  MOE_MIGRATED_REVID=85702957