IBM 2018 Senior Site Reliability Engineer (SRE) Internship in New York, New York

START AND END DATES FOR THIS INTERNSHIP ARE SPRING 2018 (6 Months), SUMMER 2018 (3 Months) and FALL 2018 (6 Months).

At IBM we have an amazing opportunity to transform the world with cognitive technology. By using the vast amounts of information available today to identify new patterns and make new discoveries, we are helping cities become smarter, hospitals transform patient care, financial institutions minimize risk, and pharmaceutical companies find cures for rare diseases. Join the forward-thinking teams at IBM solving some of the world’s most complex problems. There is no better place to launch your career!

Site Reliability Engineer Interns work closely with Development to keep cloud-deployed services operating and performing at the levels both promised to and expected by our customer base.

Site Reliability Engineer Interns are in demand across IBM's growth areas. You'll be matched and deployed to a development team in a strategic business, based on your offered location and fit. These are office-based positions in IBM locations including:

AZ - Phoenix
CA - Almaden, Costa Mesa, Emeryville, Foster City, Redwood City, San Francisco, San Jose
CO - Denver
GA - Atlanta
MA - Andover, Cambridge, Littleton
MN - Rochester
NC - Raleigh-Durham
NY - New York City, North Castle, Poughkeepsie, Yorktown Heights
OH - Cleveland, Dublin, Hartland
OR - Hillsboro
PA - Blue Bell, Pittsburgh
TX - Austin, Dallas
VT - Essex Junction

Opportunities in these locations will vary based on business demand.

What You’ll Do:

  • You’ll work in an Agile, collaborative environment to deploy, monitor, and maintain systems, which will include software installations, updates, and core services.

  • You’ll automate repetitive and error-prone tasks and processes, using tools like Ansible, Jenkins, Maven, Ant, Gradle, Chef, Puppet, Docker, UrbanCode, and a variety of scripting languages.

  • You'll ensure adequate monitoring is in place and enhance or adjust it where needed, using tools like Elasticsearch, Prometheus, Marmot, New Relic, and the IBM Cloud Monitoring Service.

  • You'll continuously measure availability, latency, and overall system health, using tools like Kibana, Grafana, Zabbix, and others.

  • You'll help with capacity planning to ensure continuous performance of the cloud systems.

  • You'll respond to incidents and drive changes that prevent the same issue from recurring. You will also look for opportunities to automate recovery from incidents that may be difficult to prevent.

  • You’ll design and implement tools for automated deployment and monitoring of multiple environments.

  • You’ll troubleshoot and resolve incidents.

Who You Are:

  • You are highly motivated and have a passion for ensuring that products are scalable and highly available.

  • You have very strong verbal and written communication skills.

  • You are great at solving problems, debugging, and designing and implementing solutions to complex technical problems.

  • You are familiar with operating systems such as Linux, Windows, iOS and Android.

  • You have a basic understanding of programming/scripting in a language such as Java, Bash, Python, or Ruby.

  • You must have basic knowledge of one of the following technology areas: Java, Jenkins, Maven, Ant, Gradle, Chef, Puppet, Docker, Ansible, UrbanCode, Bash, Python, or Ruby.

IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.