Friday, September 9, 2016

Sitecore 8 integration hooks with Solr 4.6 / ZK

Lately we ran into Solr stability issues in production, with nodes dropping out of the 3-node cluster.

We have Solr 4.6 with Zookeeper 3.5.8 running on CentOS Linux machines. There are about 15 Sitecore cores/collections configured on the Solr instances.

3-node cluster setup
16 GB of RAM, JVM heap set at 4 GB
Java 1.7

We noticed that the sitecore_analytics_index core has a lot of records, which by itself is not a big deal for Solr. However, we are using the standard out-of-the-box configuration. Is there anything we need to configure for the Sitecore cores to perform better? What are the best practices for Sitecore with Solr (if any)?

Also, Sitecore is running very heavy queries (we are not sure which core is doing this), which eats a lot of JVM resources. This causes our Solr instances to go down.

Some errors we are experiencing:
  • ERROR SolrDispatchFilter null:ClientAbortException: java.net.SocketException: Broken pipe
  • SolrCore [sitecore_marketingdefinitions_web] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
  • ERROR java.lang.OutOfMemoryError: Requested array size exceeds VM limit
  • ERROR SolrCmdDistributor org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
Solutions applied:

I. Increased the memory footprint on all 3 Linux servers to 32 GB RAM and a 16 GB JVM
   1. Log in on the Linux server as 'root'
   2. sudo service tomcat stop
   3. sudo kill -9 pid (tomcat) (only if Tomcat did not stop gracefully)
   4. Navigate to /opt/tomcat/bin
   5. Create a new setenv.sh file
   6. Edit the file and add:

  export JAVA_OPTS="$JAVA_OPTS \
  -Xms12288m \
  -Xmx12288m \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/ \
  -XX:MaxPermSize=1024m \
  -XX:MaxNewSize=2048m \
  -XX:NewSize=2048m"

   7. Save the file
   8. sudo service tomcat start
   9. Navigate to solr1.environment.pmi.org:8080/solr
   10. Ensure the Physical Memory and JVM-Memory values have changed.

    Solr was restored and stayed stable for a few days, but the errors started creeping back and we saw the same behavior.

    We captured the Catalina logs from the tomcat/logs folder and did an in-depth analysis.
    We found that all 5 Sitecore web instances were configured to write to the analytics index. Therefore Solr was getting bombarded with exceptions, and the sitecore_analytics_index was growing enormously (about 1 GB every week).
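One way to stop the CD servers from writing to the analytics index is to disable the analytics index include file on each CD, leaving only the CM server to maintain sitecore_analytics_index. A minimal sketch, assuming a default Sitecore 8 include file name; the /tmp path is a stand-in for your website's App_Config/Include folder, and the touch is only there so the sketch runs end to end:

```shell
# Stand-in for the website's App_Config/Include folder
SITE_CONFIG=${SITE_CONFIG:-/tmp/App_Config/Include}
mkdir -p "$SITE_CONFIG"

# Placeholder for the real include file shipped with Sitecore 8
touch "$SITE_CONFIG/Sitecore.ContentSearch.Solr.Index.Analytics.config"

# Renaming the include to .disabled stops this instance from
# registering (and writing to) the analytics index
mv "$SITE_CONFIG/Sitecore.ContentSearch.Solr.Index.Analytics.config" \
   "$SITE_CONFIG/Sitecore.ContentSearch.Solr.Index.Analytics.config.disabled"
```

Only the CM server keeps the original .config name; the CD servers keep the .disabled copy so the change is easy to revert.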

   Some of the most common issues we noticed in the production logs:

    1. Why is Sitecore retrieving the maximum rows available (rows=2147483647)? This is very costly for Solr.
454893853 [http-bio-8080-exec-4226] INFO  org.apache.solr.core.SolrCore  - [sitecore_web_index] webapp=/solr path=/select params={q=(slug_s:(\/certifications\/types\/pmp)+AND+_latestversion:(True))&fq=_indexname:(sitecore_web_index)&version=2.2&rows=2147483647} hits=0 status=0 QTime=0
454877084 [http-bio-8080-exec-4197] INFO  org.apache.solr.core.SolrCore  - [sitecore_web_index] webapp=/solr path=/select params={q=(slug_s:[*+TO+*]+AND+_group:(25e875fea5fa4ff3821a1413d4ee3bc1))&fq=_indexname:(sitecore_web_index)&version=2.2&rows=2147483647} hits=2 status=0 QTime=7
454877096 [http-bio-8080-exec-4225] INFO  org.apache.solr.core.SolrCore  - [sitecore_web_index] webapp=/solr path=/select params={q=(slug_s:[*+TO+*]+AND+_group:(25e875fea5fa4ff3821a1413d4ee3bc1))&fq=_indexname:(sitecore_web_index)&version=2.2&rows=2147483647} hits=2 status=0 QTime=0
   
     The code base that generates this query has been amended; it should now request only 1 row. Additionally, it will call Solr more selectively to reduce the number of calls. The fix will be distributed in the next code delivery.
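The before/after difference in the query parameters looks roughly like this (abbreviated from the log lines above; rows=1 is what the amended code is expected to send):

```
# before: effectively unlimited rows
q=(slug_s:(...)+AND+_latestversion:(True))&rows=2147483647

# after: the caller only needs the first match
q=(slug_s:(...)+AND+_latestversion:(True))&rows=1
```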


     2. Why is Sitecore retrieving all fields and also sorting on date fields? This is very costly for Solr (e.g. fl=*,score, the Solr equivalent of SELECT *).
454877030 [http-bio-8080-exec-4220] INFO  org.apache.solr.core.SolrCore  - [sitecore_web_index] webapp=/solr path=/select params={facet=true&sort=publicationdate_tdt+desc&fl=*,score&start=0&q=*:*&f.contenttype_facet_s.facet.mincount=1&facet.field=contentsources_facet_sm&facet.field=topicsfacet_sm&facet.field=contenttype_facet_s&fq=((((((((_templates:(51fe426158da421da104f3cbc23e328d)+AND+-_templates:(216357ebc69e46ceb912707c2dba28a1))+AND+_path:(110d559fdea542ea9c1c8a5df7e70ef9))+AND+-excludefromsearch_b:(True))+AND+_latestversion:(True))+AND+_language:(en))+AND+publicationdate_year_tl:[-2147483648+TO+2012])+AND+publicationdate_year_tl:[2005+TO+2147483647])+AND+((topicsfacet_sm:("Portfolio+Management")+AND+topicsfacet_sm:("Scope+Management"))+AND+topicsfacet_sm:("Program+Management")))&fq=_indexname:(sitecore_web_index)&version=2.2&f.topicsfacet_sm.facet.mincount=1&f.contentsources_facet_sm.facet.mincount=1&rows=10} hits=2 status=0 QTime=5
     
     The default behaviour of Sitecore is to get all the fields of the document. In the sample above we make this type of call with rows=10, so it is not expensive. The sorting is necessary as this query feeds the KAS articles listing, and it is still cheaper to sort in Solr than to sort in memory in Sitecore. Alternatively, a new core could be set up with fewer fields.
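If field retrieval did become expensive, Solr's fl parameter could restrict the response to just the fields the listing renders; a hypothetical trim of the query above (the field list is invented for illustration):

```
fl=slug_s,publicationdate_tdt,score&sort=publicationdate_tdt+desc&rows=10
```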

     3. Why is Sitecore doing hard commits directly on the index? Solr does not recommend this, because the commits can open more searchers than the index is configured to warm, and those searchers become unavailable.
54956494 [http-bio-8080-exec-4178] INFO  org.apache.solr.update.processor.LogUpdateProcessor  - [sitecore_analytics_index] webapp=/solr path=/update params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false} {} 0 13566
454956495 [http-bio-8080-exec-4178] ERROR org.apache.solr.core.SolrCore  - org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
    
      This is managed by Sitecore internally. We have added configuration fixes, as the logs currently reveal that all 5 servers are committing to sitecore_analytics_index. After the fix, only the CM server will be doing this.
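On the Solr side, letting the server schedule commits instead of relying on per-client commits can also relieve the warming-searcher pressure. A hedged solrconfig.xml sketch for Solr 4.x; the interval values are our own choices, not Sitecore recommendations:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard-commit on Solr's own schedule, at most once a minute,
       without opening a new searcher each time -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commits make new documents visible to searches cheaply -->
  <autoSoftCommit>
    <maxTime>15000</maxTime>
  </autoSoftCommit>
</updateHandler>
```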

     4. Solr does not recommend optimizing the index frequently, especially for cores like sitecore_analytics_index.
455450829 [http-bio-8080-exec-4167] INFO  org.apache.solr.update.processor.LogUpdateProcessor  - [sitecore_analytics_index] webapp=/solr path=/update params={optimize=true&waitSearcher=true&maxSegments=1&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2} {optimize=} 0 155892
    
      This happens on 2 occasions: automatically when a full rebuild is executed, or via a Sitecore scheduled agent, which is currently triggered on sitecore_master_index.

    sitecore_analytics_index indexes all the visitor interactions; it is not populated by Sitecore content tree data. We have added configuration fixes, as the logs currently reveal that all 5 servers are committing to sitecore_analytics_index. After the fix, only the CM server will be doing this. It also looks like a lot of empty segments were produced in the index due to the 4x deletions at the time.



Thursday, March 17, 2016

Updating Application Pool Settings using powershell on a server stack

This is a PowerShell script that you can execute remotely to update the Application Pool settings on a website. Here I am updating the .NET version to 4.0.

$environment = (Read-Host 'Which environment is this? (localhost, erpint, erpqa, staging, candidate, production)').ToLower()

# abort if the user entered an unknown environment
$validEnvironments = @('localhost', 'erpint', 'erpqa', 'staging', 'candidate', 'production')
if ($validEnvironments -notcontains $environment)
{
    Write-Host "Unknown environment '$environment'. Exiting."
    exit 1
}

for ($i = 1; $i -le 3; $i++)  # loop through each node
{
    # construct the server FQDN to remote-connect to; this will be different for you
    $server = $environment + "wfe" + $i + ".pmienvs.pmihq.org"

    Write-Host "Remote executing on $server ..."

    Invoke-Command -ComputerName $server {
        Import-Module WebAdministration
        $iisAppPoolName = "yourwebsite.org"
        $iisAppPoolDotNetVersion = "v4.0"

        # navigate to the app pools root
        cd IIS:\AppPools\

        # check if the app pool exists
        if (!(Test-Path $iisAppPoolName -PathType Container))
        {
            # app pool does not exist
            "App Pool $iisAppPoolName does not exist. Skipping updates."
        }
        else
        {
            # update the app pool settings
            $appPool = Get-Item $iisAppPoolName
            $appPool | Set-ItemProperty -Name "managedRuntimeVersion" -Value $iisAppPoolDotNetVersion
            "App Pool $iisAppPoolName updated."
        }
    }
}

Friday, October 16, 2015

Using Telerik Kendo UI ASP.Net MVC controls (EditorFor)


Prerequisites:

I am assuming that you have installed the ASP.NET Kendo UI DLL (Kendo.Mvc.dll) and added a reference to it in your project.
Also ensure that you have included all the Kendo JavaScript and CSS library files in your Visual Studio project.

Add a view and put this in your Razor .cshtml file.
Note that the model must declare all the properties, and the names must match those used in the view.

@model Models.EmailModel

 <script type="text/javascript" src="@Url.Content("~/Scripts/kendo/jquery.min.js")"></script>
<script type="text/javascript" src="@Url.Content("~/Scripts/kendo/kendo.web.min.js")"></script>
<script type="text/javascript" src="@Url.Content("~/Scripts/kendo/kendo.aspnetmvc.min.js")"></script>

  @using (Html.BeginForm("SendEmail", "Email", FormMethod.Post))
    {
             <div>
        <fieldset>
            @Html.ValidationMessageFor(x => x.HtmlContent, "Email content cannot be empty.")
            @(Html.Kendo().EditorFor(m => m.HtmlContent)
            .Encode(false)
            .Name("HtmlContent")
            .HtmlAttributes(new { style = "height:440px" })
            .Tools(tools => tools
                .Clear()
                .Bold().Italic().Underline().Strikethrough()
                .JustifyLeft().JustifyCenter().JustifyRight().JustifyFull()
                .InsertUnorderedList().InsertOrderedList()
                .CreateLink().Unlink()
              )
            .Value(Model.HtmlContent)
            )
        </fieldset>
        </div>
        <div  style="float:right">
            @(Html.Kendo().Button().Name("sendButton").Content("Send Email"))
        </div>
    }
    <div  style="float:right">
        @(Html.Kendo().Button().Name("cancelButton").Content("Cancel").Events(ev => ev.Click("onClickCancel")))
    </div>
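The cancel button above wires its click event to an onClickCancel handler that the view does not define. A minimal sketch; the redirect target is a placeholder, not part of the original code:

```javascript
// Handler for the Kendo cancel button's click event.
// Redirects to a hypothetical URL; replace with your own cancel flow.
function onClickCancel(e) {
    window.location.href = "/Email";
}
```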

Saturday, June 20, 2015

Typical Exhibitor.properties config

#Auto-generated by Exhibitor
#Fri Jun 19 18:40:19 EDT 2015
com.netflix.exhibitor-rolling-hostnames=
com.netflix.exhibitor-rolling.zookeeper-data-directory=/var/lib/zookeeper/data
com.netflix.exhibitor-rolling.servers-spec=1\:192.168.174.147
com.netflix.exhibitor.java-environment=
com.netflix.exhibitor.zookeeper-data-directory=/var/lib/zookeeper/data
com.netflix.exhibitor-rolling-hostnames-index=0
com.netflix.exhibitor-rolling.java-environment=
com.netflix.exhibitor-rolling.observer-threshold=999
com.netflix.exhibitor.servers-spec=S\:10\:192.168.174.147,S\:11\:192.168.174.187,S\:12\:192.168.174.188
com.netflix.exhibitor.cleanup-period-ms=43200000
com.netflix.exhibitor.auto-manage-instances-fixed-ensemble-size=3
com.netflix.exhibitor.zookeeper-install-directory=/opt/zookeeper
com.netflix.exhibitor.check-ms=30000
com.netflix.exhibitor.zookeeper-log-directory=/var/lib/zookeeper/logs
com.netflix.exhibitor-rolling.auto-manage-instances=0
com.netflix.exhibitor-rolling.cleanup-period-ms=43200000
com.netflix.exhibitor-rolling.auto-manage-instances-settling-period-ms=180000
com.netflix.exhibitor-rolling.check-ms=30000
com.netflix.exhibitor.log-index-directory=
com.netflix.exhibitor-rolling.log-index-directory=
com.netflix.exhibitor.backup-period-ms=60000
com.netflix.exhibitor-rolling.connect-port=2888
com.netflix.exhibitor-rolling.election-port=3888
com.netflix.exhibitor-rolling.backup-extra=
com.netflix.exhibitor.client-port=2181
com.netflix.exhibitor-rolling.zoo-cfg-extra=syncLimit\=5&tickTime\=2000&initLimit\=10&autopurge.snapRetainCount\=3&autopurge.purgeInterval\=1
com.netflix.exhibitor-rolling.zookeeper-install-directory=/opt/zookeeper
com.netflix.exhibitor.cleanup-max-files=3
com.netflix.exhibitor-rolling.auto-manage-instances-fixed-ensemble-size=3
com.netflix.exhibitor-rolling.backup-period-ms=60000
com.netflix.exhibitor-rolling.client-port=2181
com.netflix.exhibitor.backup-max-store-ms=86400000
com.netflix.exhibitor-rolling.cleanup-max-files=3
com.netflix.exhibitor-rolling.backup-max-store-ms=86400000
com.netflix.exhibitor.connect-port=2888
com.netflix.exhibitor.backup-extra=
com.netflix.exhibitor.observer-threshold=999
com.netflix.exhibitor.log4j-properties=\# Define some default values that can be overridden by system properties\nzookeeper.root.logger\=INFO, FILE\nzookeeper.console.threshold\=INFO, FILE\nzookeeper.log.dir\=.\nzookeeper.log.file\=zookeeper.log\nzookeeper.log.threshold\=INFO,FATAL\nzookeeper.tracelog.dir\=.\nzookeeper.tracelog.file\=zookeeper_trace.log\n\n\#\n\# ZooKeeper Logging Configuration\n\#\n\n\# Format is "<default threshold> (, <appender>)+\n\n\# DEFAULT\: console appender only\n\# log4j.rootLogger\=${zookeeper.root.logger}\n\n\# Example with rolling log file\nlog4j.rootLogger\=INFO, CONSOLE, ROLLINGFILE\n\n\# Example with rolling log file and tracing\n\#log4j.rootLogger\=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE\n\n\#\n\# Log INFO level and above messages to the console\n\#\nlog4j.appender.CONSOLE\=org.apache.log4j.ConsoleAppender\nlog4j.appender.CONSOLE.Threshold\=${zookeeper.console.threshold}\nlog4j.appender.CONSOLE.layout\=org.apache.log4j.PatternLayout\nlog4j.appender.CONSOLE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L] - %m%n\n\n\#\n\# Add ROLLINGFILE to rootLogger to get log file output\n\#    Log DEBUG level and above messages to a log file\nlog4j.appender.ROLLINGFILE\=org.apache.log4j.RollingFileAppender\nlog4j.appender.ROLLINGFILE.Threshold\=${zookeeper.log.threshold}\nlog4j.appender.ROLLINGFILE.File\=${zookeeper.log.dir}/${zookeeper.log.file}\n\n\# Max log file size of 10MB\nlog4j.appender.ROLLINGFILE.MaxFileSize\=10MB\n\# uncomment the next line to limit number of backup files\n\#log4j.appender.ROLLINGFILE.MaxBackupIndex\=10\n\nlog4j.appender.ROLLINGFILE.layout\=org.apache.log4j.PatternLayout\nlog4j.appender.ROLLINGFILE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L] - %m%n\n\n\n\#\n\# Add TRACEFILE to rootLogger to get log file output\n\#    Log DEBUG level and above messages to a log 
file\nlog4j.appender.TRACEFILE\=org.apache.log4j.FileAppender\nlog4j.appender.TRACEFILE.Threshold\=TRACE\nlog4j.appender.TRACEFILE.File\=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}\n\nlog4j.appender.TRACEFILE.layout\=org.apache.log4j.PatternLayout\n\#\#\# Notice we are including log4j's NDC here (%x)\nlog4j.appender.TRACEFILE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L][%x] - %m%n\n
com.netflix.exhibitor.auto-manage-instances-apply-all-at-once=1
com.netflix.exhibitor.election-port=3888
com.netflix.exhibitor-rolling.auto-manage-instances-apply-all-at-once=1
com.netflix.exhibitor.zoo-cfg-extra=syncLimit\=5&tickTime\=2000&initLimit\=10&autopurge.snapRetainCount\=3&autopurge.purgeInterval\=1
com.netflix.exhibitor-rolling.zookeeper-log-directory=/var/lib/zookeeper/logs
com.netflix.exhibitor.auto-manage-instances-settling-period-ms=180000
com.netflix.exhibitor-rolling.log4j-properties=\# Define some default values that can be overridden by system properties\nzookeeper.root.logger\=INFO, FILE\nzookeeper.console.threshold\=INFO, FILE\nzookeeper.log.dir\=.\nzookeeper.log.file\=zookeeper.log\nzookeeper.log.threshold\=INFO,FATAL\nzookeeper.tracelog.dir\=.\nzookeeper.tracelog.file\=zookeeper_trace.log\n\n\#\n\# ZooKeeper Logging Configuration\n\#\n\n\# Format is "<default threshold> (, <appender>)+\n\n\# DEFAULT\: console appender only\n\# log4j.rootLogger\=${zookeeper.root.logger}\n\n\# Example with rolling log file\nlog4j.rootLogger\=INFO, CONSOLE, ROLLINGFILE\n\n\# Example with rolling log file and tracing\n\#log4j.rootLogger\=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE\n\n\#\n\# Log INFO level and above messages to the console\n\#\nlog4j.appender.CONSOLE\=org.apache.log4j.ConsoleAppender\nlog4j.appender.CONSOLE.Threshold\=${zookeeper.console.threshold}\nlog4j.appender.CONSOLE.layout\=org.apache.log4j.PatternLayout\nlog4j.appender.CONSOLE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L] - %m%n\n\n\#\n\# Add ROLLINGFILE to rootLogger to get log file output\n\#    Log DEBUG level and above messages to a log file\nlog4j.appender.ROLLINGFILE\=org.apache.log4j.RollingFileAppender\nlog4j.appender.ROLLINGFILE.Threshold\=${zookeeper.log.threshold}\nlog4j.appender.ROLLINGFILE.File\=${zookeeper.log.dir}/${zookeeper.log.file}\n\n\# Max log file size of 10MB\nlog4j.appender.ROLLINGFILE.MaxFileSize\=10MB\n\# uncomment the next line to limit number of backup files\n\#log4j.appender.ROLLINGFILE.MaxBackupIndex\=10\n\nlog4j.appender.ROLLINGFILE.layout\=org.apache.log4j.PatternLayout\nlog4j.appender.ROLLINGFILE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L] - %m%n\n\n\n\#\n\# Add TRACEFILE to rootLogger to get log file output\n\#    Log DEBUG level and above messages to a log 
file\nlog4j.appender.TRACEFILE\=org.apache.log4j.FileAppender\nlog4j.appender.TRACEFILE.Threshold\=TRACE\nlog4j.appender.TRACEFILE.File\=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}\n\nlog4j.appender.TRACEFILE.layout\=org.apache.log4j.PatternLayout\n\#\#\# Notice we are including log4j's NDC here (%x)\nlog4j.appender.TRACEFILE.layout.ConversionPattern\=%d{ISO8601} [myid\:%X{myid}] - %-5p [%t\:%C{1}@%L][%x] - %m%n\n
com.netflix.exhibitor.auto-manage-instances=0

Thursday, May 14, 2015

Building and installing Exhibitor WAR using Maven

-- Dependencies required for building Exhibitor

-- Launch CentOS console via VMPlayer
-- login as root/Password1
$ yum install git
$ yum install maven
$ quit

-- login as admin/Password

$ sudo mkdir ~/temp
$ cd ~/temp
$ sudo git clone git://github.com/Netflix/exhibitor.git
$ cd exhibitor/exhibitor-standalone/src/main/resources/buildscripts/war/maven
$ sudo vi src/main/resources/exhibitor.properties
-- uncomment the two property lines, set the port to 8080, and then save the file
exhibitor-configtype=file
exhibitor-port=8080
$ sudo vi pom.xml
--  change the version to 1.5.5 and save the file

$ sudo mvn clean compile war:war

-- The exhibitor war file is built and ready to use.
-- Next Steps (if required)

$ sudo cp exhibitor/exhibitor-standalone/src/main/resources/buildscripts/war/maven/target/exhibitor-war-1.0.war /opt/tomcat/webapps/ROOT.war
$ sudo /opt/tomcat/bin/startup.sh start

Verify http://192.168.174.128:8080/exhibitor/v1/ui/index.html


Note:
Found the root cause of the version not rendering on the Exhibitor landing page. It says "dev" in the right-hand corner instead of the actual version.
If you use 1.5.1 or 1.5.2, it works and displays the version.
Based on the artifact version in the pom.xml file, Maven downloads files from http://repo.maven.apache.org/maven2/com/netflix/exhibitor/exhibitor-standalone/1.5.2/exhibitor-standalone-1.5.2.pom
Compare with 1.5.3:
http://repo.maven.apache.org/maven2/com/netflix/exhibitor/exhibitor-standalone/1.5.3/exhibitor-standalone-1.5.3.pom
The dependency scope is runtime vs. compile,
therefore it does not compile for 1.5.3 or higher.
So it all depends on how soon Netflix fixes it and uploads the file.

Thursday, April 30, 2015

Knowledge Base : Apache Solr 4.6 and Zookeeper 3.4.5


Issue:   Zookeeper Leader Conflict

Exception: SEVERE: ClusterState says we are the leader, but locally we don't think so

Cause: The Zookeeper session on the leader node expires sooner than expected and it goes haywire. After this happens, the leader (001) goes into 'recovery' mode and all the index updates fail with a "503 - Service Unavailable" error message.

Resolution:

Increase the Zookeeper session timeout in solr.xml.
Recycle Node 2 and Node 3 in sequence:
sudo service tomcat stop
sudo service tomcat start
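For reference, the session timeout lives in the solrcloud section of solr.xml. A sketch for Solr 4.x; the 60-second value is the choice we made here, not a documented default:

```xml
<solrcloud>
  <str name="host">${host:}</str>
  <int name="hostPort">${jetty.port:8080}</int>
  <!-- raise the ZooKeeper session timeout so brief GC pauses
       do not expire the leader's session -->
  <int name="zkClientTimeout">${zkClientTimeout:60000}</int>
</solrcloud>
```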

-----------------------------------------------------------------------------------------------------------------------

Issue:  Zookeeper autopurge is not working

Exception:  java.io.FileNotFoundException: ./zookeeper.log (Permission denied)

Cause: The following paths in /opt/zookeeper/conf/zoo.cfg are incorrectly configured:
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/data

Resolution:
1. Stop ZK on all nodes. Edit /opt/zookeeper/conf/zoo.cfg
2. Update directly from the ZK UI and commit:
  • dataDir=/var/lib/zookeeper/data
  • dataLogDir=/var/lib/zookeeper/logs
  • autopurge.snapRetainCount=3
  • autopurge.purgeInterval=1
3. Run this on the command line to grant permissions:
  • sudo chown -R tomcat /var/lib/zookeeper/data/myid
  • sudo chown -R tomcat /var/lib/zookeeper/logs/zookeeper.log
4. Restart tomcat on all nodes
References used:
New in 3.4.0: When enabled, the ZooKeeper auto purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively, and deletes the rest. Defaults to 3. Minimum value is 3.
autopurge.purgeInterval
(No Java system property)
New in 3.4.0: The time interval in hours for which the purge task has to be triggered. Set to a positive integer (1 and above) to enable auto purging. Defaults to 0.

-------------------------------------------------------------------------------------------------------------------------


Thursday, November 29, 2012

Generating Sequence Diagram from Visual Studio

If you have the Visual Studio 2010 Ultimate edition installed on your machine, you can generate a sequence diagram from the code base.
Open the .cs file where you have public methods declared.

Highlight the method name.
Right-click and click on 'Generate Sequence Diagram'.
Visual Studio automatically generates a sequence diagram for you in a new window.