-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix range formatting issues #421
Conversation
@xorye I have not tested for the moment your PR but your demo looks like very impressive !!!
It's a problem since XMLFormatter is a singleton for all formatting of all files. If you have a 2 formatting which is done for 2 files you will have some trouble (ex: format a.xml which takes times and b.xml which is fomatted, the b.xml will override the .this.fullDocument of a.xml formatting, etc). Your code is cleaner than existing code, so I think we should keep it (setupRangeFormatting, setupFullFormatting, etc), but thoses methods should be moved to a new class. I mean
Please add javadoc for each methods. It looks like this: /**
* Copyright (c) 2018 Angelo ZERR
* All rights reserved. This program and the accompanying materials
* are made available under the terms of the Eclipse Public License v2.0
* which accompanies this distribution, and is available at
* http://www.eclipse.org/legal/epl-v20.html
*
* Contributors:
* Angelo Zerr <[email protected]> - initial API and implementation
*/
package org.eclipse.lsp4xml.services;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.eclipse.lsp4j.Position;
import org.eclipse.lsp4j.Range;
import org.eclipse.lsp4j.TextEdit;
import org.eclipse.lsp4xml.commons.BadLocationException;
import org.eclipse.lsp4xml.commons.TextDocument;
import org.eclipse.lsp4xml.dom.DOMAttr;
import org.eclipse.lsp4xml.dom.DOMCDATASection;
import org.eclipse.lsp4xml.dom.DOMComment;
import org.eclipse.lsp4xml.dom.DOMDocument;
import org.eclipse.lsp4xml.dom.DOMDocumentType;
import org.eclipse.lsp4xml.dom.DOMElement;
import org.eclipse.lsp4xml.dom.DOMNode;
import org.eclipse.lsp4xml.dom.DOMParser;
import org.eclipse.lsp4xml.dom.DOMProcessingInstruction;
import org.eclipse.lsp4xml.dom.DOMText;
import org.eclipse.lsp4xml.dom.DTDAttlistDecl;
import org.eclipse.lsp4xml.dom.DTDDeclNode;
import org.eclipse.lsp4xml.dom.DTDDeclParameter;
import org.eclipse.lsp4xml.services.extensions.XMLExtensionsRegistry;
import org.eclipse.lsp4xml.settings.XMLFormattingOptions;
import org.eclipse.lsp4xml.utils.XMLBuilder;
/**
* XML formatter support.
*
*/
class XMLFormatter {
private static final Logger LOGGER = Logger.getLogger(XMLFormatter.class.getName());
private static class XMLFormatterDocument {
private final TextDocument textDocument;
private final Range range;
private final XMLFormattingOptions options;
private int startOffset;
private int endOffset;
private DOMDocument fullDomDocument;
private DOMDocument rangeDomDocument;
private XMLBuilder xmlBuilder;
private int indentLevel;
public XMLFormatterDocument(TextDocument textDocument, Range range, XMLFormattingOptions options) {
this.textDocument = textDocument;
this.range = range;
this.options = options;
}
public List<? extends TextEdit> format() throws BadLocationException {
this.fullDomDocument = DOMParser.getInstance().parse(textDocument.getText(), textDocument.getUri(), null,
false);
if (range != null) {
setupRangeFormatting(range);
} else {
setupFullFormatting(range);
}
this.indentLevel = getStartingIndentLevel();
format(this.rangeDomDocument);
List<? extends TextEdit> textEdits = getFormatTextEdit();
return textEdits;
}
private void setupRangeFormatting(Range range) throws BadLocationException {
int startOffset = textDocument.offsetAt(range.getStart());
int endOffset = textDocument.offsetAt(range.getEnd());
Position startPosition = textDocument.positionAt(startOffset);
Position endPosition = textDocument.positionAt(endOffset);
enlargePositionToGutters(startPosition, endPosition);
this.startOffset = textDocument.offsetAt(startPosition);
this.endOffset = textDocument.offsetAt(endPosition);
String fullText = textDocument.getText();
String rangeText = fullText.substring(this.startOffset, this.endOffset);
this.rangeDomDocument = DOMParser.getInstance().parse(rangeText, textDocument.getUri(), null, false);
this.xmlBuilder = new XMLBuilder(this.options, "", textDocument.lineDelimiter(startPosition.getLine()));
}
private void setupFullFormatting(Range range) throws BadLocationException {
this.startOffset = 0;
this.endOffset = textDocument.getText().length();
this.rangeDomDocument = this.fullDomDocument;
Position startPosition = textDocument.positionAt(startOffset);
this.xmlBuilder = new XMLBuilder(this.options, "", textDocument.lineDelimiter(startPosition.getLine()));
}
private void enlargePositionToGutters(Position start, Position end) throws BadLocationException {
start.setCharacter(0);
end.setCharacter(this.textDocument.lineText(end.getLine()).length());
}
private int getStartingIndentLevel() throws BadLocationException {
DOMNode startNode = this.fullDomDocument.findNodeAt(this.startOffset);
if (startNode.isOwnerDocument()) {
return 0;
}
DOMNode startNodeParent = startNode.getParentNode();
if (startNodeParent.isOwnerDocument()) {
return 0;
}
// the starting indent level is the parent's indent level + 1
int startNodeIndentLevel = getNodeIndentLevel(startNodeParent) + 1;
return startNodeIndentLevel;
}
private int getNodeIndentLevel(DOMNode node) throws BadLocationException {
Position nodePosition = this.textDocument.positionAt(node.getStart());
String textBeforeNode = this.textDocument.lineText(nodePosition.getLine()).substring(0,
nodePosition.getCharacter() + 1);
int spaceOrTab = getSpaceOrTabStartOfString(textBeforeNode);
if (options.isInsertSpaces()) {
return (spaceOrTab / this.options.getTabSize());
}
return spaceOrTab;
}
private int getSpaceOrTabStartOfString(String string) {
int i = 0;
int spaceOrTab = 0;
while (i < string.length() && (string.charAt(i) == ' ' || string.charAt(i) == '\t')) {
spaceOrTab++;
i++;
}
return spaceOrTab;
}
private int getFullOffsetFromRangeOffset(int rangeOffset) {
return rangeOffset + this.startOffset;
}
private DOMElement getFullDocElemFromRangeElem(DOMElement elemFromRangeDoc) {
int fullOffset = getFullOffsetFromRangeOffset(elemFromRangeDoc.getEnd());
DOMElement elemFromFullDoc = (DOMElement) this.fullDomDocument.findNodeAt(fullOffset);
return elemFromFullDoc;
}
private boolean startTagExistsInRangeDocument(DOMNode node) {
if (!node.isElement()) {
return false;
}
return ((DOMElement) node).getStartTagOpenOffset() != DOMNode.NULL_VALUE;
}
private boolean startTagExistsInFullDocument(DOMNode node) {
if (!node.isElement()) {
return false;
}
DOMElement elemFromFullDoc = getFullDocElemFromRangeElem((DOMElement) node);
if (elemFromFullDoc == null) {
return false;
}
return elemFromFullDoc.getStartTagOpenOffset() != DOMNode.NULL_VALUE;
}
private void format(DOMNode node) throws BadLocationException {
if (node.getNodeType() != DOMNode.DOCUMENT_NODE) {
boolean doLineFeed;
if (node.getOwnerDocument().isDTD()) {
doLineFeed = false;
} else {
doLineFeed = !(node.isComment() && ((DOMComment) node).isCommentSameLineEndTag())
&& (!node.isText() || (!((DOMText) node).isWhitespace() && ((DOMText) node).hasSiblings()));
}
if (this.indentLevel > 0 && doLineFeed) {
// add new line + indent
if (!node.isChildOfOwnerDocument() || node.getPreviousNonTextSibling() != null) {
this.xmlBuilder.linefeed();
}
if (!startTagExistsInRangeDocument(node) && startTagExistsInFullDocument(node)) {
DOMNode startNode = getFullDocElemFromRangeElem((DOMElement) node);
int currentIndentLevel = getNodeIndentLevel(startNode);
this.xmlBuilder.indent(currentIndentLevel);
this.indentLevel = currentIndentLevel;
} else {
this.xmlBuilder.indent(this.indentLevel);
}
}
// generate start element
if (node.isElement()) {
DOMElement element = (DOMElement) node;
String tag = element.getTagName();
if (element.hasEndTag() && !element.hasStartTag()) {
// bad element which have not start tag (ex: <\root>)
xmlBuilder.endElement(tag, element.isEndTagClosed());
} else {
xmlBuilder.startElement(tag, false);
if (element.hasAttributes()) {
// generate attributes
List<DOMAttr> attributes = element.getAttributeNodes();
if (attributes.size() == 1) {
DOMAttr singleAttribute = attributes.get(0);
xmlBuilder.addSingleAttribute(singleAttribute.getName(),
singleAttribute.getOriginalValue());
} else {
for (DOMAttr attr : attributes) {
xmlBuilder.addAttribute(attr, this.indentLevel);
}
}
}
if (element.isStartTagClosed()) {
xmlBuilder.closeStartElement();
}
boolean hasElements = false;
if (node.hasChildNodes()) {
// element has body
// startElementClosed = true;
this.indentLevel++;
for (DOMNode child : node.getChildren()) {
boolean textElement = !child.isText();
hasElements = hasElements | textElement;
format(child);
}
this.indentLevel--;
}
if (element.hasEndTag()) {
if (hasElements) {
this.xmlBuilder.linefeed();
this.xmlBuilder.indent(this.indentLevel);
}
// end tag element is done, only if the element is closed
// the format, doesn't fix the close tag
if (element.hasEndTag() && element.getEndTagOpenOffset() <= this.endOffset) {
this.xmlBuilder.endElement(tag, element.isEndTagClosed());
} else {
this.xmlBuilder.selfCloseElement();
}
} else if (element.isSelfClosed()) {
this.xmlBuilder.selfCloseElement();
}
}
return;
} else if (node.isCDATA()) {
DOMCDATASection cdata = (DOMCDATASection) node;
this.xmlBuilder.startCDATA();
this.xmlBuilder.addContentCDATA(cdata.getData());
this.xmlBuilder.endCDATA();
} else if (node.isComment()) {
DOMComment comment = (DOMComment) node;
this.xmlBuilder.startComment(comment);
this.xmlBuilder.addContentComment(comment.getData());
this.xmlBuilder.endComment();
if (this.indentLevel == 0) {
this.xmlBuilder.linefeed();
}
} else if (node.isProcessingInstruction()) {
addPIToXMLBuilder(node, this.xmlBuilder);
if (this.indentLevel == 0) {
this.xmlBuilder.linefeed();
}
} else if (node.isProlog()) {
addPrologToXMLBuilder(node, this.xmlBuilder);
this.xmlBuilder.linefeed();
} else if (node.isText()) {
DOMText textNode = (DOMText) node;
// Generate content
String content = textNode.getData();
xmlBuilder.addContent(content, textNode.isWhitespace(), textNode.hasSiblings(),
textNode.getDelimiter(), this.indentLevel);
return;
} else if (node.isDoctype()) {
boolean isDTD = node.getOwnerDocument().isDTD();
DOMDocumentType documentType = (DOMDocumentType) node;
if (!isDTD) {
this.xmlBuilder.startDoctype();
List<DTDDeclParameter> params = documentType.getParameters();
for (DTDDeclParameter param : params) {
if (!documentType.isInternalSubset(param)) {
xmlBuilder.addParameter(param.getParameter());
} else {
xmlBuilder.startDoctypeInternalSubset();
xmlBuilder.linefeed();
// level + 1 since the 'level' value is the doctype tag's level
formatDTD(documentType, this.indentLevel + 1, this.endOffset, this.xmlBuilder);
xmlBuilder.linefeed();
xmlBuilder.endDoctypeInternalSubset();
}
}
if (documentType.isClosed()) {
xmlBuilder.endDoctype();
}
xmlBuilder.linefeed();
} else {
formatDTD(documentType, 0, this.endOffset, this.xmlBuilder);
}
return;
}
} else if (node.hasChildNodes()) {
// Other nodes kind like root
for (DOMNode child : node.getChildren()) {
format(child);
}
}
}
private static boolean formatDTD(DOMDocumentType doctype, int level, int end, XMLBuilder xmlBuilder) {
DOMNode previous = null;
for (DOMNode node : doctype.getChildren()) {
if (previous != null) {
xmlBuilder.linefeed();
}
xmlBuilder.indent(level);
if (node.isText()) {
xmlBuilder.addContent(((DOMText) node).getData().trim());
} else if (node.isComment()) {
DOMComment comment = (DOMComment) node;
xmlBuilder.startComment(comment);
xmlBuilder.addContentComment(comment.getData());
xmlBuilder.endComment();
} else if (node.isProcessingInstruction()) {
addPIToXMLBuilder(node, xmlBuilder);
} else if (node.isProlog()) {
addPrologToXMLBuilder(node, xmlBuilder);
} else {
boolean setEndBracketOnNewLine = false;
DTDDeclNode decl = (DTDDeclNode) node;
xmlBuilder.addDeclTagStart(decl);
if (decl.isDTDAttListDecl()) {
DTDAttlistDecl attlist = (DTDAttlistDecl) decl;
List<DTDAttlistDecl> internalDecls = attlist.getInternalChildren();
if (internalDecls == null) {
for (DTDDeclParameter param : decl.getParameters()) {
xmlBuilder.addParameter(param.getParameter());
}
} else {
boolean multipleInternalAttlistDecls = false;
List<DTDDeclParameter> params = attlist.getParameters();
DTDDeclParameter param;
for (int i = 0; i < params.size(); i++) {
param = params.get(i);
if (attlist.elementName.equals(param)) {
xmlBuilder.addParameter(param.getParameter());
if (attlist.getParameters().size() > 1) { // has parameters after elementName
xmlBuilder.linefeed();
xmlBuilder.indent(level + 1);
setEndBracketOnNewLine = true;
multipleInternalAttlistDecls = true;
}
} else {
if (multipleInternalAttlistDecls && i == 1) {
xmlBuilder.addUnindentedParameter(param.getParameter());
} else {
xmlBuilder.addParameter(param.getParameter());
}
}
}
for (DTDAttlistDecl attlistDecl : internalDecls) {
xmlBuilder.linefeed();
xmlBuilder.indent(level + 1);
params = attlistDecl.getParameters();
for (int i = 0; i < params.size(); i++) {
param = params.get(i);
if (i == 0) {
xmlBuilder.addUnindentedParameter(param.getParameter());
} else {
xmlBuilder.addParameter(param.getParameter());
}
}
}
}
} else {
for (DTDDeclParameter param : decl.getParameters()) {
xmlBuilder.addParameter(param.getParameter());
}
}
if (setEndBracketOnNewLine) {
xmlBuilder.linefeed();
xmlBuilder.indent(level);
}
if (decl.isClosed()) {
xmlBuilder.closeStartElement();
}
}
previous = node;
}
return true;
}
private List<? extends TextEdit> getFormatTextEdit() throws BadLocationException {
Position startPosition = this.textDocument.positionAt(this.startOffset);
Position endPosition = this.textDocument.positionAt(this.endOffset);
Range r = new Range(startPosition, endPosition);
List<TextEdit> edits = new ArrayList<>();
edits.add(new TextEdit(r, this.xmlBuilder.toString()));
return edits;
}
private static boolean isFirstChildNode(DOMNode node) {
return node.equals(node.getParentNode().getFirstChild());
}
private static boolean isPreviousSiblingNodeType(DOMNode node, short nodeType) {
DOMNode previousNode = node.getPreviousSibling();
return previousNode != null && previousNode.getNodeType() == nodeType;
}
private static void addPIToXMLBuilder(DOMNode node, XMLBuilder xml) {
DOMProcessingInstruction processingInstruction = (DOMProcessingInstruction) node;
xml.startPrologOrPI(processingInstruction.getTarget());
String content = processingInstruction.getData();
if (content.length() > 0) {
xml.addContentPI(content);
} else {
xml.addContent(" ");
}
xml.endPrologOrPI();
}
private static void addPrologToXMLBuilder(DOMNode node, XMLBuilder xml) {
DOMProcessingInstruction processingInstruction = (DOMProcessingInstruction) node;
xml.startPrologOrPI(processingInstruction.getTarget());
if (node.hasAttributes()) {
addAttributes(node, xml);
}
xml.endPrologOrPI();
}
/**
* Will add all attributes, to the given builder, on a single line
*/
private static void addAttributes(DOMNode node, XMLBuilder xmlBuilder) {
List<DOMAttr> attrs = node.getAttributeNodes();
if (attrs == null) {
return;
}
for (DOMAttr attr : attrs) {
xmlBuilder.addAttributesOnSingleLine(attr, true);
}
xmlBuilder.appendSpace();
}
}
public XMLFormatter(XMLExtensionsRegistry extensionsRegistry) {
}
public List<? extends TextEdit> format(TextDocument textDocument, Range range,
XMLFormattingOptions formattingOptions) {
try {
XMLFormatterDocument formatterDocument = new XMLFormatterDocument(textDocument, range, formattingOptions);
return formatterDocument.format();
} catch (BadLocationException e) {
LOGGER.log(Level.SEVERE, "Formatting failed due to BadLocation", e);
}
return null;
}
} |
@angelozerr, thanks for the amazing feedback, I have made the changes and made javadoc comments for each public method. |
@fbricon I pushed some changes to resolve this. If the range is detected to contain a portion of the start tag, the range will enlarge to contain the start tag, and then it will format. |
Signed-off-by: David Kwon <[email protected]>
Fixes #76
This PR fixes some problems with range formatting.
It is important to note that with this fix, the range of the highlighted region will have its start and end lines to be extended to the ends or the "gutters" of the document.
For example, this fix will consider this range:
to be identical to this range:
The most significant change made in
XMLFormatter.java
, was the introduction to more instance variables (which are set tonull
when formatting is done).Many methods in
XMLFormatter.java
rely on these instance variables. I am not entirely sure if this was a good idea in terms of design.For both selection formatting and full document formatting,
this.fullDomDocument
andthis.rangeDomDocument
are set.this.fullDomDocument
-> theDOMDocument
of the whole documentthis.rangeDomDocument
-> theDOMDocument
for the part of the document within the formatting rangeNote that for full document formatting,
this.fullDomDocument
andthis.rangeDomDocument
refer to the sameDOMDocument
object, because the formatting range is the whole document.Note: This could have been implemented differently, see the last paragraph of this PR message.
The reason why selection formatting requires
this.fullDomDocument
is because selection formatting requires contextual information from the whole document, in order to determine thelevel of indentation. Selection formatting requires information about nodes in
this.fullDomDocument
, which do not exist inthis.rangeDomDocument
.Since the nodes in
this.rangeDomDocument
is a subset of nodes inthis.fullDomDocument
, the only differences are that:this.rangeDomDocument
, would be the start of the selected range, wheras offset 0 forthis.fullDomDocument
would be the start of the whole document.The method
getFullOffsetFromRangeOffset(int offset)
converts an offset from a node inthis.rangeDomDocument
, to an equivalent offset for nodes inthis.fullDomDocument
.this.rangeDomDocument
may indicate that they do not have a starting or ending tag, even if the corresponding element inthis.fullDomDocument
has a starting and ending tag. For example, if the selected range contained only the end tag of an element, that corresponding element inthis.rangeDomDocument
believes that the element is missing a start tag, while the corresponding element inthis.fullDomDocument
knows that the element has both a start and end tag.Simple cases that are tested to work:
Case 1:
Selection formatting when the whole element (start tag, children, end tag) is selected.
The code looks at the parent element's indentation level, and formats accordingly.
Case 2:
Selection formatting when the start tag (with or without children) is selected.
Case 3:
Selection formatting when children are selected.
Case 4:
Selection formatting when the end tag and children are selected
How this works is that when the code realizes that
</properties>
does not have a start tag within the selected range, it looks for the start tagthis.fullDomDocument
. If found, the indentation of</properties>
will be set to match the indentation of its start tag.This is achieved here:
https:/xorye/lsp4xml/blob/ac6a02989bbc471e6ad2aa0fc7dfba3aac3039db/org.eclipse.lsp4xml/src/main/java/org/eclipse/lsp4xml/services/XMLFormatter.java#L224-L228
Case 5:
This case is the same as Case 4 except, a sibling of
properties
is also selected afterproperties
's ending tag.The formatter ensures that the indentation level for
</url>
matches the indentation level for</properties>
. This is achieved here: https:/xorye/lsp4xml/blob/ac6a02989bbc471e6ad2aa0fc7dfba3aac3039db/org.eclipse.lsp4xml/src/main/java/org/eclipse/lsp4xml/services/XMLFormatter.java#L228To explain why the indentation level (
this.indentLevel
) is now an instance variable rather than a parameter (as it was before this PR), is because in this fix, the indentation level should be able to decrease in special situations like this case (Case 5).For example:
When range formatting is initiated with the above range,
format()
is called with an initial indentation level of 2, because for line 3 to be properly formatted, it would need to be indented only twice. Since the initial indentation level is 2,the
format()
method will also indent its sibling on line 4, twice. Upon reaching line 5, this line of code:https:/xorye/lsp4xml/blob/ac6a02989bbc471e6ad2aa0fc7dfba3aac3039db/org.eclipse.lsp4xml/src/main/java/org/eclipse/lsp4xml/services/XMLFormatter.java#L228
correctly decreases the indentation level to 1, since everything on lines
2 and 3 are children of
properties
. Sinceurl
in line 6 is a sibling ofproperties
, it should have an indentation level that matchesproperties
. Sincethis.indentLevel
is an instance variable,it is assured that when the indentation level was set to 1, it will still be 1 by the time
format()
formatsurl
. If the indentation level was a method parameter instead, when the indentation level is decreased to1, the indentation level would remain at 2 by the time
format()
reachesurl
What does not work:
Case 5, but if
properties
has an indentation level of 0.The
format()
method will not add a newline if the indentation level is 0.https:/xorye/lsp4xml/blob/ac6a02989bbc471e6ad2aa0fc7dfba3aac3039db/org.eclipse.lsp4xml/src/main/java/org/eclipse/lsp4xml/services/XMLFormatter.java#L218-L221
There may definitely still be many other errors.
Another idea that may be less prone to errors:
Since the nodes in
this.rangeDomDocument
are a subset of nodes inthis.fullDomDocument
, get rid ofthis.rangeDomDocument
and have onlythis.fullDomDocument
, but keep track of the selected range.For each node in the selected range, format the nodes' indentations by determining the # of ancestors for that node (ie, if a node has 2 ancestors, have 2 indents). The problem with this way, is that we
do not get to easily reuse
format()
, which contains a lot of formatting logic already.